(54) Title: COMPARATIVE GENE TRANSCRIPT ANALYSIS
(57) Abstract

A method and system for quantifying the relative abundance of gene transcripts in a biological specimen. One embodiment of the
method generates high-throughput sequence-specific analysis of multiple RNAs or their corresponding cDNAs (gene transcript imaging
analysis). Another embodiment of the method produces a gene transcript imaging analysis by the use of high-throughput cDNA sequence
analysis. In addition, the gene transcript imaging can be used to detect or diagnose a particular biological state, disease, or condition
which is correlated to the relative abundance of gene transcripts in a given cell or population of cells. The invention provides a method
for comparing the gene transcript image analysis from two or more different biological specimens in order to distinguish between the two
specimens and identify one or more genes which are differentially expressed between the two specimens.
COMPARATIVE GENE TRANSCRIPT ANALYSIS

1. FIELD OF INVENTION

The present invention is in the field of molecular
biology and computer science; more particularly, the
present invention describes methods of analyzing gene
transcripts and diagnosing the genetic expression of cells
and tissue.

2. BACKGROUND OF THE INVENTION

Until very recently, the history of molecular biology
has been written one gene at a time. Scientists have
observed the cell's physical changes, isolated mixtures
from the cell or its milieu, purified proteins, sequenced
proteins and therefrom constructed probes to look for the
corresponding gene.

Recently, different nations have set up massive
projects to sequence the billions of bases in the human
genome. These projects typically begin with dividing the
genome into large portions of chromosomes and then
determining the sequences of these pieces, which are then
analyzed for identity with known proteins or portions
thereof, known as motifs. Unfortunately, the majority of
genomic DNA does not encode proteins and though it is
postulated to have some effect on the cell's ability to
make protein, its relevance to medical applications is not
understood at this time.

A third methodology involves sequencing only the
transcripts encoding the cellular machinery actively
involved in making protein, namely the mRNA. The advantage
is that the cell has already edited out all the non-coding
DNA, and it is relatively easy to identify the
protein-coding portion of the RNA. The utility of this approach
was not immediately obvious to genomic researchers. In
fact, when cDNA sequencing was initially proposed, the
method was roundly denounced by those committed to genomic
sequencing. For example, the head of the U.S. Human Genome
Project discounted cDNA sequencing as not valuable and
refused to approve funding of projects.

In this disclosure, we teach methods for analyzing
DNA, including cDNA libraries. Based on our analyses and
research, we see each individual gene product as a "pixel"
of information, which relates to the expression of that,
and only that, gene. We teach herein methods whereby the
individual "pixels" of gene expression information can be
combined into a single gene transcript "image," in which
each of the individual genes can be visualized
simultaneously, allowing relationships between the gene
pixels to be easily visualized and understood.

We further teach a new method which we call electronic
subtraction. Electronic subtraction will enable the gene
researcher to turn a single image into a moving picture,
one which describes the temporality or dynamics of gene
expression, at the level of a cell or a whole tissue. It
is that sense of "motion" of cellular machinery on the
scale of a cell or organ which constitutes the new
invention herein. This constitutes a new view into the
process of living cell physiology and one which holds great
promise to unveil and discover new therapeutic and
diagnostic approaches in medicine.
We teach another method, which we call "electronic
northern," which tracks the expression of a single gene
across many types of cells and tissues.
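
As a rough illustration of the "electronic northern" idea, the following Python sketch looks up one gene's clone count in several libraries and reports its relative abundance in each. The library names, gene names and counts are invented for illustration, and `electronic_northern` is a hypothetical helper, not the program listed in Table 5.

```python
from typing import Dict


def electronic_northern(gene: str,
                        libraries: Dict[str, Dict[str, int]]) -> Dict[str, float]:
    """Return the fraction of sequenced clones matching `gene` in each library."""
    profile = {}
    for name, counts in libraries.items():
        total = sum(counts.values())
        # A gene absent from a library is reported with abundance 0.0.
        profile[name] = counts.get(gene, 0) / total if total else 0.0
    return profile


# Hypothetical clone counts per gene for three libraries.
libraries = {
    "monocyte": {"geneA": 2, "geneB": 8},
    "macrophage": {"geneA": 5, "geneB": 5},
    "liver": {"geneB": 4},
}
profile = electronic_northern("geneA", libraries)
```

Dividing by the library total is one possible normalization; it keeps libraries of different sizes comparable.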

Nucleic acids (DNA and RNA) carry within their
sequence the hereditary information and are therefore the
prime molecules of life. Nucleic acids are found in all
living organisms including bacteria, fungi, viruses, plants
and animals. It is of interest to determine the relative
abundance of different discrete nucleic acids in different
cells, tissues and organisms over time under various
conditions, treatments and regimes.

All dividing cells in the human body contain the same
set of 23 pairs of chromosomes. It is estimated that these
autosomal and sex chromosomes encode approximately 100,000
genes. The differences among different types of cells are
believed to reflect the differential expression of the
100,000 or so genes. Fundamental questions of biology
could be answered by understanding which genes are
transcribed and knowing the relative abundance of
transcripts in different cells.

Previously, the art has only provided for the analysis
of a few known genes at a time by standard molecular
biology techniques such as PCR, northern blot analysis, or
other types of DNA probe analysis such as in situ
hybridization. Each of these methods allows one to analyze
the transcription of only known genes and/or small numbers
of genes at a time. Nucl. Acids Res. 19, 7097-7104 (1991);
Nucl. Acids Res. 18, 4833-42 (1990); Nucl. Acids Res. 18,
2789-92 (1989); European J. Neuroscience 2, 1063-1073
(1990); Analytical Biochem. 187, 364-73 (1990); Genet.
Annals Techn. Appl. 7, 64-70 (1990); GATA 8(4), 129-33
(1991); Proc. Natl. Acad. Sci. USA 85, 1696-1700 (1988);
Nucl. Acids Res. 19, 1954 (1991); Proc. Natl. Acad. Sci.
USA 88, 1943-47 (1991); Nucl. Acids Res. 19, 6123-27
(1991); Proc. Natl. Acad. Sci. USA 85, 5738-42 (1988);
Nucl. Acids Res. 16, 10937 (1988).

Studies of the number and types of genes whose
transcription is induced or otherwise regulated during cell
processes such as activation, differentiation, aging, viral
transformation, morphogenesis, and mitosis have been
pursued for many years, using a variety of methodologies.
One of the earliest methods was to isolate and analyze
levels of the proteins in a cell, tissue, organ system, or
even organism both before and after the process of
interest. One method of analyzing multiple proteins in a
sample is 2-dimensional gel electrophoresis, wherein
proteins can be, in principle, identified and quantified as
individual bands, and ultimately reduced to a discrete
signal. At present, 2-dimensional analysis only resolves
approximately 15% of the proteins. In order to positively
analyze those bands which are resolved, each band must be
excised from the membrane and subjected to protein sequence
analysis using Edman degradation. Unfortunately, most of
the bands were present in quantities too small to obtain a
reliable sequence, and many of those bands contained more
than one discrete protein. An additional difficulty is
that many of the proteins were blocked at the
amino-terminus, further complicating the sequencing
process.

Analyzing differentiation at the gene transcription
level has overcome many of these disadvantages and
drawbacks, since the power of recombinant DNA technology
allows amplification of signals containing very small
amounts of material. The most common method, called
"hybridization subtraction," involves isolation of mRNA
from the biological specimen before (B) and after (A) the
developmental process of interest, transcribing one set of
mRNA into cDNA, subtracting specimen B from specimen A
(mRNA from cDNA) by hybridization, and constructing a cDNA
library from the non-hybridizing mRNA fraction. Many
different groups have used this strategy successfully, and
a variety of procedures have been published and improved
upon using this same basic scheme. Nucl. Acids Res. 19,
7097-7104 (1991); Nucl. Acids Res. 18, 4833-42 (1990);
Nucl. Acids Res. 18, 2789-92 (1989); European J.
Neuroscience 2, 1063-1073 (1990); Analytical Biochem. 187,
364-73 (1990); Genet. Annals Techn. Appl. 7, 64-70 (1990);
GATA 8(4), 129-33 (1991); Proc. Natl. Acad. Sci. USA 85,
1696-1700 (1988); Nucl. Acids Res. 19, 1954 (1991); Proc.
Natl. Acad. Sci. USA 88, 1943-47 (1991); Nucl. Acids Res.
19, 6123-27 (1991); Proc. Natl. Acad. Sci. USA 85, 5738-42
(1988); Nucl. Acids Res. 16, 10937 (1988).

Although each of these techniques has particular
strengths and weaknesses, there are still some limitations
and undesirable aspects of these methods. First, the time
and effort required to construct such libraries is quite
large. Typically, a trained molecular biologist might
expect construction and characterization of such a library
to require 3 to 6 months, depending on the level of skill,
experience, and luck. Second, the resulting subtraction
libraries are typically inferior to the libraries
constructed by standard methodology. A typical
conventional cDNA library should have a clone complexity of
at least 10** clones, and an average insert size of 1-3 kB.
In contrast, subtracted libraries can have complexities of
10^ or 10^ and average insert sizes of 0.2 kB. Therefore,
there can be a significant loss of clone and sequence
information associated with such libraries. Third, this
approach allows the researcher to capture only the genes
induced in specimen A relative to specimen B, not
vice-versa, nor does it easily allow comparison to a third
specimen of interest (C). Fourth, it requires
very large amounts (hundreds of micrograms) of "driver"
mRNA (specimen B), which significantly limits the numbers
and type of subtractions that are possible, since many
tissues and cells are very difficult to obtain in large
quantities.

Fifth, the resolution of the subtraction is dependent
upon the physical properties of DNA:DNA or RNA:DNA
hybridization. The ability of a given sequence to find a
hybridization match is dependent on its unique CoT value.
The CoT value is a function of the number of copies
(concentration) of the particular sequence, multiplied by
the time of hybridization. It follows that for sequences
which are abundant, hybridization events will occur very
rapidly (low CoT value), while rare sequences will form
duplexes at very high CoT values. CoT values which allow
such rare sequences to form duplexes and therefore be
effectively selected are difficult to achieve in a
convenient time frame. Therefore, hybridization
subtraction is simply not a useful technique with which to
study relative levels of rare mRNA species. Sixth, this
problem is further complicated by the fact that duplex
formation is also dependent on the nucleotide base
composition for a given sequence. Those sequences rich in
G + C form stronger duplexes than those with high contents
of A + T. Therefore, the former sequences will tend to be
removed selectively by hybridization subtraction. Seventh,
it is possible that hybridization between nonexact matches
can occur. When this happens, the expression of a
homologous gene may "mask" expression of a gene of
interest, artificially skewing the results for that
particular gene.
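
The dependence on CoT described above can be made concrete with a short numerical sketch. It assumes the ideal second-order reassociation model commonly used for CoT curves, in which the fraction of a sequence still single-stranded is 1 / (1 + k * C0 * t). That model, the rate constant and the concentrations below are illustrative assumptions, not data from this disclosure.

```python
def cot(c0: float, t: float) -> float:
    """CoT value: concentration of the sequence times hybridization time."""
    return c0 * t


def fraction_single_stranded(c0: float, t: float, k: float) -> float:
    """Assumed ideal second-order kinetics: C/C0 = 1 / (1 + k * C0 * t)."""
    return 1.0 / (1.0 + k * cot(c0, t))


def time_to_half_reassociation(c0: float, k: float) -> float:
    """Time at which half of the sequence is in duplex, i.e. k * C0 * t = 1."""
    return 1.0 / (k * c0)


k = 1.0                       # arbitrary rate constant
abundant, rare = 1e-2, 1e-6   # arbitrary concentrations of two sequences
t_abundant = time_to_half_reassociation(abundant, k)
t_rare = time_to_half_reassociation(rare, k)
```

Under this model a sequence that is 10,000 times rarer needs 10,000 times longer to reach the same degree of duplex formation, which is the practical obstacle the text describes for rare mRNA species.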

Matsubara and Okubo proposed using partial cDNA
sequences to establish expression profiles of genes which
could be used in functional analyses of the human genome.
Matsubara and Okubo warned against using random priming, as
it creates multiple unique DNA fragments from individual
mRNAs and may thus skew the analysis of the number of
particular mRNAs per library. They sequenced randomly
selected members from a 3'-directed cDNA library and
established the frequency of appearance of the various
ESTs. They proposed comparing lists of ESTs from various
cell types to classify genes. Genes expressed in many
different cell types were labeled housekeepers and those
selectively expressed in certain cells were labeled
cell-specific genes, even in the absence of the full sequence of
the gene or the biological activity of the gene product.
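
The classification described above can be sketched in a few lines of Python. The thresholds are assumptions made for this example: a gene seen in every cell type is called a housekeeper, a gene seen in exactly one is called cell-specific, and anything else is left as intermediate. The text itself only says "many different cell types" and "certain cells", so `classify_genes` and its `min_fraction` parameter are hypothetical.

```python
from typing import Dict, Set


def classify_genes(est_lists: Dict[str, Set[str]],
                   min_fraction: float = 1.0) -> Dict[str, str]:
    """Label each gene by how many cell types it was observed in."""
    n_types = len(est_lists)
    seen_in: Dict[str, int] = {}
    for genes in est_lists.values():
        for gene in genes:
            seen_in[gene] = seen_in.get(gene, 0) + 1

    labels = {}
    for gene, n in seen_in.items():
        if n / n_types >= min_fraction:
            labels[gene] = "housekeeper"      # present in (nearly) all cell types
        elif n == 1:
            labels[gene] = "cell-specific"    # present in a single cell type
        else:
            labels[gene] = "intermediate"
    return labels
```

Note that the labels depend only on which ESTs were observed, so they can be assigned before the full sequence or the activity of the gene product is known.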

The present invention avoids the drawbacks of the
prior art by providing a method to quantify the relative
abundance of multiple gene transcripts in a given
biological specimen by the use of high-throughput
sequence-specific analysis of individual RNAs and/or their
corresponding cDNAs.

The present invention offers several advantages over
current protein discovery methods which attempt to isolate
individual proteins based upon biological effects. The
method of the instant invention provides for detailed
diagnostic comparisons of cell profiles revealing numerous
changes in the expression of individual transcripts.

The instant invention provides several advantages over
current subtraction methods including a more complex
library analysis (10^ to 10^ clones as compared to 10^
clones) which allows identification of low abundance
messages as well as enabling the identification of messages
which either increase or decrease in abundance. These
large libraries are very routine to make in contrast to the
libraries of previous methods. In addition, homologues can
easily be distinguished with the method of the instant
invention.

This method is very convenient because it organizes a
large quantity of data into a comprehensible, digestible
format. The most significant differences are highlighted
by electronic subtraction. In-depth analyses are made more
convenient.

The present invention provides several advantages over
previous methods of electronic analysis of cDNA. The
method is particularly powerful when more than 100 and
preferably more than 1,000 gene transcripts are analyzed.
In such a case, new low-frequency transcripts are
discovered and tissue typed.

High resolution analysis of gene expression can be
used directly as a diagnostic profile or to identify
disease-specific genes for the development of more classic
diagnostic approaches.

This process is defined as gene transcript frequency
analysis. The resulting quantitative analysis of the gene
transcripts is defined as comparative gene transcript
analysis.

3. SUMMARY OF THE INVENTION

The invention is a method of analyzing a specimen
containing gene transcripts comprising the steps of (a)
producing a library of biological sequences; (b) generating
a set of transcript sequences, where each of the transcript
sequences in said set is indicative of a different one of
the biological sequences of the library; (c) processing the
transcript sequences in a programmed computer (in which a
database of reference transcript sequences indicative of
reference sequences is stored), to generate an identified
sequence value for each of the transcript sequences, where
each said identified sequence value is indicative of
sequence annotation and a degree of match between one of
the biological sequences of the library and at least one of
the reference sequences; and (d) processing each said
identified sequence value to generate final data values
indicative of the number of times each identified sequence
value is present in the library.
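
Steps (c) and (d) can be sketched as follows. This is a minimal illustration under stated assumptions, not the software of the invention: Python's `difflib.SequenceMatcher` stands in for whatever sequence comparison the programmed computer performs, the `similar_cutoff` threshold is arbitrary, and the reference sequences are invented. Each identified sequence value is modeled as an (annotation, degree of match) pair.

```python
from collections import Counter
from difflib import SequenceMatcher
from typing import Dict, List, Tuple


def identify(seq: str, reference: Dict[str, str],
             similar_cutoff: float = 0.8) -> Tuple[str, str]:
    """Step (c): return (annotation, degree of match) for one transcript."""
    best_name, best_score = "unknown", 0.0
    for name, ref_seq in reference.items():
        score = SequenceMatcher(None, seq, ref_seq).ratio()
        if score > best_score:
            best_name, best_score = name, score
    if best_score == 1.0:
        return best_name, "exact"
    if best_score >= similar_cutoff:
        return best_name, "similar"
    return "unknown", "non-match"


def tabulate(transcripts: List[str], reference: Dict[str, str]) -> Counter:
    """Step (d): count how often each identified sequence value occurs."""
    return Counter(identify(seq, reference) for seq in transcripts)
```

The resulting counter plays the role of the "final data values": one count per distinct identified sequence value in the library.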

The invention also includes a method of comparing two
specimens containing gene transcripts. The first specimen
is processed as described above. The second specimen is
used to produce a second library of biological sequences,
which is used to generate a second set of transcript
sequences, where each of the transcript sequences in the
second set is indicative of one of the biological sequences
of the second library. Then the second set of transcript
sequences is processed in a programmed computer to generate
a second set of identified sequence values, namely the
further identified sequence values, each of which is
indicative of a sequence annotation and includes a degree
of match between one of the biological sequences of the
second library and at least one of the reference sequences.
The further identified sequence values are processed to
generate further final data values indicative of the number
of times each further identified sequence value is present
in the second library. The final data values from the
first specimen and the further identified sequence values
from the second specimen are processed to generate ratios
of transcript sequences, which indicate the differences in
the number of gene transcripts between the two specimens.

In a further embodiment, the method includes
quantifying the relative abundance of mRNA in a biological
specimen by (a) isolating a population of mRNA transcripts
from a biological specimen; (b) identifying genes from
which the mRNA was transcribed by a sequence-specific
method; (c) determining the numbers of mRNA transcripts
corresponding to each of the genes; and (d) using the mRNA
transcript numbers to determine the relative abundance of
mRNA transcripts within the population of mRNA transcripts.

Also disclosed is a method of producing a gene
transcript image analysis by first obtaining a mixture of
mRNA, from which cDNA copies are made. The cDNA is
inserted into a suitable vector which is used to transfect
suitable host strain cells which are plated out and
permitted to grow into clones, each clone representing a
unique mRNA. A representative population of clones
transfected with cDNA is isolated. Each clone in the
population is identified by a sequence-specific method
which identifies the gene from which the unique mRNA was
transcribed. The number of times each gene is identified
to a clone is determined to evaluate gene transcript
abundance. The genes and their abundances are listed in
order of abundance to produce a gene transcript image.

In a further embodiment, the relative abundance of the
gene transcripts in one cell type or tissue is compared
with the relative abundance of gene transcript numbers in a
second cell type or tissue in order to identify the
differences and similarities.

In a further embodiment, the method includes a system
for analyzing a library of biological sequences including a
means for receiving a set of transcript sequences, where
each of the transcript sequences is indicative of a
different one of the biological sequences of the library;
and a means for processing the transcript sequences in a
computer system in which a database of reference transcript
sequences indicative of reference sequences is stored,
wherein the computer is programmed with software for
generating an identified sequence value for each of the
transcript sequences, where each said identified sequence
value is indicative of a sequence annotation and the degree
of match between a different one of the biological
sequences of the library and at least one of the reference
sequences, and for processing each said identified sequence
value to generate final data values indicative of the
number of times each identified sequence value is present
in the library.

In essence, the invention is a method and system for
quantifying the relative abundance of gene transcripts in a
biological specimen. The invention provides a method for
comparing the gene transcript image from two or more
different biological specimens in order to distinguish
between the two specimens and identify one or more genes
which are differentially expressed between the two
specimens. Thus, this gene transcript image and its
comparison can be used as a diagnostic. One embodiment of
the method generates high-throughput sequence-specific
analysis of multiple RNAs or their corresponding cDNAs: a
gene transcript image. Another embodiment of the method
produces the gene transcript imaging analysis by the use of
high-throughput cDNA sequence analysis. In addition, two
or more gene transcript images can be compared and used to
detect or diagnose a particular biological state, disease,
or condition which is correlated to the relative abundance
of gene transcripts in a given cell or population of cells.

4. DESCRIPTION OF THE TABLES AND DRAWINGS

4.1. TABLES

Table 1 presents a detailed explanation of the letter
codes utilized in Tables 2-5.

Table 2 lists the one hundred most common gene
transcripts. It is a partial list of isolates from the
HUVEC cDNA library prepared and sequenced as described
below. The left-hand column refers to the sequence's order
of abundance in this table. The next column, labeled
"number", is the clone number of the first HUVEC sequence
identification reference matching the sequence in the
"entry" column number. Isolates that have not been
sequenced are not present in Table 2. The next column,
labeled "N", indicates the total number of cDNAs which have
the same degree of match with the sequence of the reference
transcript in the "entry" column.

The column labeled "entry" gives the NIH GENBANK locus
name, which corresponds to the library sequence numbers.
The "s" column indicates in a few cases the species of the
reference sequence. The code for column "s" is given in
Table 1. The column labeled "descriptor" provides a plain
English explanation of the identity of the sequence
corresponding to the NIH GENBANK locus name in the "entry"
column.
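
For readers who want to handle such a table programmatically, one row of Table 2 could be modeled roughly as below. The field names follow the column labels described above; the class name `Table2Row` and every value in the example row are placeholders invented for illustration and do not come from the table.

```python
from dataclasses import dataclass


@dataclass
class Table2Row:
    rank: int         # left-hand column: order of abundance in the table
    number: int       # clone number of the first matching HUVEC sequence
    n: int            # "N": cDNAs with the same degree of match to the entry
    entry: str        # NIH GENBANK locus name
    s: str            # species code, as defined in Table 1
    descriptor: str   # plain English description of the sequence


# Placeholder values only; not an actual row of Table 2.
row = Table2Row(rank=1, number=12345, n=10, entry="HYPOTHETICAL1",
                s="X", descriptor="hypothetical example entry")
```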

Table 3 is a comparison of the top fifteen most 
abundant gene transcripts in normal monocytes and activated 
macrophage cells. 

Table 4 is a detailed summary of library subtraction
analysis comparing the THP-1 and human macrophage
cDNA sequences. In Table 4, the same code as in Table 2 is
used. Additional columns are for "bgfreq" (abundance
number in the subtractant library), "rfend" (abundance
number in the target library) and "ratio" (the target
abundance number divided by the subtractant abundance
number). As is clear from perusal of the table, when the
abundance number in the subtractant library is "0", the
target abundance number is divided by 0.05. This is a way
of obtaining a result (not possible dividing by 0) and
distinguishing the result from ratios of subtractant
numbers of 1.
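
The zero-handling rule for the "ratio" column can be written out as a small Python sketch. The 0.05 substitute comes from the text above; the function name and the sample counts are illustrative assumptions.

```python
ZERO_SUBSTITUTE = 0.05  # used in place of a subtractant abundance of 0


def subtraction_ratio(target: int, subtractant: int) -> float:
    """Target abundance divided by subtractant abundance, avoiding division by 0."""
    denominator = subtractant if subtractant > 0 else ZERO_SUBSTITUTE
    return target / denominator


ratio_absent = subtraction_ratio(3, 0)   # gene absent from the subtractant library
ratio_single = subtraction_ratio(3, 1)   # gene seen once in the subtractant library
```

With the same target count of 3, the first ratio is about 60 and the second is 3, so a transcript missing from the subtractant library is clearly separated from one that appears there once.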

Table 5 is the computer program, written in source
code, for generating gene transcript subtraction profiles.

Table 6 is a partial listing of database entries used
in the electronic northern blot analysis as provided by the
present invention.

4.2. BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a chart summarizing data collected and
stored regarding the library construction portion of
sequence preparation and analysis.

Figure 2 is a diagram representing the sequence of
operations performed by "abundance sort" software in a
class of preferred embodiments of the inventive method.

Figure 3 is a block diagram of a preferred embodiment
of the system of the invention.

Figure 4 is a more detailed block diagram of the
bioinformatics process from new sequence (that has already
been sequenced but not identified) to printout of the
transcript imaging analysis and the provision of database
subscriptions.

5. DETAILED DESCRIPTION OF THE INVENTION

The present invention provides a method to compare the
relative abundance of gene transcripts in different
biological specimens by the use of high-throughput
sequence-specific analysis of individual RNAs or their
corresponding cDNAs (or alternatively, of data representing
other biological sequences). This process is denoted
herein as gene transcript imaging. The quantitative
analysis of the relative abundance for a set of gene
transcripts is denoted herein as "gene transcript image
analysis" or "gene transcript frequency analysis". The
present invention allows one to obtain a profile for gene
transcription in any given population of cells or tissue
from any type of organism. The invention can be applied to
obtain a profile of a specimen consisting of a single cell
(or clones of a single cell), or of many cells, or of
tissue more complex than a single cell and containing
multiple cell types, such as liver.

The invention has significant advantages in the fields
of diagnostics, toxicology and pharmacology, to name a few.
A highly sophisticated diagnostic test can be performed on
the ill patient in whom a diagnosis has not been made. A
biological specimen consisting of the patient's fluids or
tissues is obtained, and the gene transcripts are isolated
and expanded to the extent necessary to determine their
identity. Optionally, the gene transcripts can be
converted to cDNA. A sampling of the gene transcripts is
subjected to sequence-specific analysis and quantified.
These gene transcript sequence abundances are compared
against reference database sequence abundances including
normal data sets for diseased and healthy patients. The
patient has the disease(s) with which the patient's data
set most closely correlates.
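
The comparison against reference data sets can be sketched as a nearest-profile lookup. The text does not specify a correlation measure, so the Pearson correlation used here is an assumption, as are the helper names and the toy abundance profiles.

```python
from math import sqrt
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def closest_reference(patient: Dict[str, float],
                      references: Dict[str, Dict[str, float]]) -> str:
    """Return the name of the reference data set that best correlates."""
    genes = sorted(patient)
    x = [patient[g] for g in genes]
    scores = {
        name: pearson(x, [ref.get(g, 0.0) for g in genes])
        for name, ref in references.items()
    }
    return max(scores, key=scores.get)
```

In practice the profiles would hold relative abundances for many transcripts, and a real diagnostic would need a measure of confidence in addition to the best match.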

For example, gene transcript frequency analysis can be
used to differentiate normal cells or tissues from diseased
cells or tissues, just as it highlights differences between
normal monocytes and activated macrophages in Table 3.

In toxicology, a fundamental question is which tests
are most effective in predicting or detecting a toxic
effect. Gene transcript imaging provides highly detailed
information on the cell and tissue environment, some of
which would not be obvious in conventional, less detailed
screening methods. The gene transcript image is a more
powerful method to predict drug toxicity and efficacy.
Similar benefits accrue in the use of this tool in
pharmacology. The gene transcript image can be used
selectively to look at protein categories which are
expected to be affected, for example, enzymes which
detoxify toxins.

In an alternative embodiment, comparative gene
transcript frequency analysis is used to differentiate
between cancer cells which respond to anti-cancer agents
and those which do not respond. Examples of anti-cancer
agents are tamoxifen, vincristine, vinblastine,
podophyllotoxins, etoposide, teniposide, cisplatin,
biologic response modifiers such as interferon, IL-2,
GM-CSF, enzymes, hormones and the like. This method also
provides a means for sorting the gene transcripts by
functional category. In the case of cancer cells,
transcription factors or other essential regulatory
molecules are very important categories to analyze across
different libraries.

In yet another embodiment, comparative gene transcript
frequency analysis is used to differentiate between control
liver cells and liver cells isolated from patients treated
with experimental drugs like FIAU to distinguish between
pathology caused by the underlying disease and that caused
by the drug.

In yet another embodiment, comparative gene transcript
frequency analysis is used to differentiate between brain
tissue from patients treated and untreated with lithium.

In a further embodiment, comparative gene transcript
frequency analysis is used to differentiate between
cyclosporin and FK506-treated cells and normal cells.

In a further embodiment, comparative gene transcript
frequency analysis is used to differentiate between virally
infected (including HIV-infected) human cells and
uninfected human cells. Gene transcript frequency analysis
is also used to rapidly survey gene transcripts in
HIV-resistant, HIV-infected, and HIV-sensitive cells.
Comparison of gene transcript abundance will indicate the
success of treatment and/or new avenues to study.

In a further embodiment, comparative gene transcript
frequency analysis is used to differentiate between
bronchial lavage fluids from healthy and unhealthy patients
with a variety of ailments.

In a further embodiment, comparative gene transcript
frequency analysis is used to differentiate between cell,
plant, microbial and animal mutants and wild-type species.
In addition, the transcript abundance program is adapted to
permit the scientist to evaluate the transcription of one
gene in many different tissues. Such comparisons could
identify deletion mutants which do not produce a gene
product and point mutants which produce a less abundant or
otherwise different message. Such mutations can affect
basic biochemical and pharmacological processes, such as
mineral nutrition and metabolism, and can be isolated by
means known to those skilled in the art. Thus, crops with
improved yields, pest resistance and other factors can be
developed.

In a further embodiment, comparative gene transcript
frequency analysis is used for an interspecies comparative
analysis which would allow for the selection of better
pharmacologic animal models. In this embodiment, humans
and other animals (such as a mouse), or their cultured
cells, are treated with a specific test agent. The relative
sequence abundance of each cDNA population is determined.
If the animal test system is a good model, homologous genes
in the animal cDNA population should change expression
similarly to those in human cells. If side effects are
detected with the drug, a detailed transcript abundance
analysis will be performed to survey gene transcript
changes. Models will then be evaluated by comparing basic
physiological changes.
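
One simple way to score how "similarly" homologous genes change is to compare the direction of change in each species. The sketch below is only an illustration under stated assumptions: homologous genes are assumed to have been mapped to shared names beforehand, raw clone counts are compared without normalizing for library size, and `concordance` is a hypothetical scoring function not described in the text.

```python
from typing import Dict, Tuple

Counts = Dict[str, Tuple[int, int]]  # gene -> (untreated count, treated count)


def direction(before: int, after: int) -> int:
    """+1 if abundance rose after treatment, -1 if it fell, 0 if unchanged."""
    return (after > before) - (after < before)


def concordance(human: Counts, animal: Counts) -> float:
    """Fraction of shared genes whose expression changes in the same direction."""
    shared = set(human) & set(animal)
    if not shared:
        return 0.0
    agree = sum(direction(*human[g]) == direction(*animal[g]) for g in shared)
    return agree / len(shared)
```

A score near 1.0 would suggest that the animal system tracks the human response for the genes surveyed, while a low score would argue against using it as a model for that test agent.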

In a further embodiment, comparative gene transcript
frequency analysis is used in a clinical setting to give a
highly detailed gene transcript profile of a patient's
cells or tissue (for example, a blood sample). In
particular, gene transcript frequency analysis is used to
give a high resolution gene expression profile of a
diseased state or condition.

In the preferred embodiment, the method utilizes
high-throughput cDNA sequencing to identify specific
transcripts of interest. The generated cDNA and deduced
amino acid sequences are then extensively compared with
GENBANK and other sequence data banks as described below.
The method offers several advantages over current protein
discovery by two-dimensional gel methods which try to
identify individual proteins involved in a particular
biological effect. Here, detailed comparisons of profiles
of activated and inactive cells reveal numerous changes in
the expression of individual transcripts. After it is
determined if the sequence is an "exact" match, similar or
a non-match, the sequence is entered into a database.
Next, the numbers of copies of cDNA corresponding to each
gene are tabulated. Although this can be done slowly and
arduously, if at all, by human hand from a printout of all
entries, a computer program is a useful and rapid way to
tabulate this information. The numbers of cDNA copies
(optionally divided by the total number of sequences in the
data set) provide a picture of the relative abundance of
transcripts for each corresponding gene. The list of
represented genes can then be sorted by abundance in the
cDNA population. A multitude of additional types of
comparisons or dimensions are possible and are exemplified
below.

An alternate method of producing a gene transcript
image includes the steps of obtaining a mixture of test
mRNA and providing a representative array of unique probes
whose sequences are complementary to at least some of the
test mRNAs. Next, a fixed amount of the test mRNA is added
to the arrayed probes. The test mRNA is incubated with the
probes for a sufficient time to allow hybrids of the test
mRNA and probes to form. The mRNA-probe hybrids are
detected and the quantity determined. The hybrids are
identified by their location in the probe array. The
quantity of each hybrid is summed to give a population
number. Each hybrid quantity is divided by the population
number to provide a set of relative abundance data termed a
gene transcript image analysis.
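
By way of illustration only, the normalization of hybrid
quantities described above may be sketched as follows. The
function name, the probe location labels and the signal
values are hypothetical and are assumed here purely for
the example; they are not taken from the disclosure.

```python
def transcript_image(hybrid_quantities):
    """Convert per-probe hybrid quantities into relative abundances.

    hybrid_quantities maps a probe location in the array (a
    hypothetical label such as "A1") to the measured hybrid quantity.
    """
    # Sum of all hybrid quantities: the "population number".
    population = sum(hybrid_quantities.values())
    if population == 0:
        raise ValueError("no hybrids detected")
    # Each hybrid quantity divided by the population number.
    return {probe: qty / population
            for probe, qty in hybrid_quantities.items()}


# Hypothetical signals for three probe locations.
signals = {"A1": 30.0, "A2": 10.0, "B1": 60.0}
image = transcript_image(signals)
```

The resulting values sum to one, so that images obtained
from arrays with different total signal can be compared.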

6. EXAMPLES

The examples below are provided to illustrate the 
subject invention. These examples are provided by way of 
illustration and are not included for the purpose of 
limiting the invention. 

6.1. TISSUE SOURCES AND CELL LINES

For analysis with the computer program claimed herein,
biological sequences can be obtained from virtually any
source. Most popular are tissues obtained from the human
body. Tissues can be obtained from any organ of the body,
any age donor, any abnormality or any immortalized cell
line. Immortal cell lines may be preferred in some
instances because of their purity of cell type; other
tissue samples invariably include mixed cell types. A
special technique is available to take a single cell (for
example, a brain cell) and harness the cellular machinery
to grow up sufficient cDNA for sequencing by the techniques
and analysis described herein (cf. U.S. Patent Nos.
5,021,335 and 5,168,038, which are incorporated by
reference). The examples given herein utilized the
following immortalized cell lines: monocyte-like U-937
cells, activated macrophage-like THP-1 cells, induced
vascular endothelial cells (HUVEC cells) and mast cell-like
HMC-1 cells.

The U-937 cell line is a human histiocytic lymphoma
cell line with monocyte characteristics, established from
malignant cells obtained from the pleural effusion of a
patient with diffuse histiocytic lymphoma (Sundstrom, C.
and Nilsson, K. (1976) Int. J. Cancer 17:565). U-937 is
one of only a few human cell lines with the morphology,
cytochemistry, surface receptors and monocyte-like
characteristics of histiocytic cells. These cells can be
induced to terminal monocytic differentiation and will
express new cell surface molecules when activated with
supernatants from human mixed lymphocyte cultures. Upon
this type of in vitro activation, the cells undergo
morphological and functional changes, including
augmentation of antibody-dependent cellular cytotoxicity
(ADCC) against erythroid and tumor target cells (one of the
principal functions of macrophages). Activation of U-937
cells with phorbol 12-myristate 13-acetate (PMA) in vitro
stimulates the production of several compounds, including
prostaglandins, leukotrienes and platelet-activating factor
(PAF), which are potent inflammatory mediators. Thus,
U-937 is a cell line that is well suited for the
identification and isolation of gene transcripts associated
with normal monocytes.

The HUVEC cell line is a normal, homogeneous, well
characterized, early passage endothelial cell culture from
human umbilical vein (Cell Systems Corp., 12815 NE 124th
Street, Kirkland, WA 98034). Only gene transcripts from
induced or treated HUVEC cells were sequenced. One batch
of 1 X 10* cells was treated for 5 hours with 1 U/ml rIL-1b
and 100 ng/ml E. coli lipopolysaccharide (LPS) endotoxin
prior to harvesting. A separate batch of 2 X 10* cells was
treated at confluence with 4 U/ml TNF and 2 U/ml
interferon-gamma (IFN-gamma) prior to harvesting.

THP-1 is a human leukemic cell line with distinct
monocytic characteristics. This cell line was derived from
the blood of a 1-year-old boy with acute monocytic leukemia
(Tsuchiya, S. et al. (1980) Int. J. Cancer: 171-76). The
following cytological and cytochemical criteria were used
to determine the monocytic nature of the cell line: 1) the
presence of alpha-naphthyl butyrate esterase activity which
could be inhibited by sodium fluoride; 2) the production of
lysozyme; 3) the phagocytosis of latex particles and
sensitized SRBC (sheep red blood cells); and 4) the ability
of mitomycin C-treated THP-1 cells to activate
T-lymphocytes following ConA (concanavalin A) treatment.
Morphologically, the cytoplasm contained small azurophilic
granules and the nucleus was indented and irregularly
shaped with deep folds. The cell line had Fc and C3b
receptors, probably functioning in phagocytosis. THP-1
cells treated with the tumor promoter
12-O-tetradecanoyl-phorbol-13 acetate (TPA) stop
proliferating and differentiate into macrophage-like cells
which mimic native monocyte-derived macrophages in several
respects. Morphologically, as the cells change shape, the
nucleus becomes more irregular and additional phagocytic
vacuoles appear in the cytoplasm. The differentiated THP-1
cells also exhibit an increased adherence to tissue culture
plastic.

HMC-1 cells (a human mast cell line) were established
from the peripheral blood of a Mayo Clinic patient with
mast cell leukemia (Leukemia Res. (1988) 12:345-55). The
cultured cells looked similar to immature cloned murine
mast cells, contained histamine, and stained positively for
chloroacetate esterase, amino caproate esterase, eosinophil
major basic protein (MBP) and tryptase. The HMC-1 cells
have, however, lost the ability to synthesize normal IgE
receptors. HMC-1 cells also possess a 10;16 translocation,
present in cells initially collected by leukophoresis from
the patient and not an artifact of culturing. Thus, HMC-1
cells are a good model for mast cells.

6.2. CONSTRUCTION OF cDNA LIBRARIES

For inter-library comparisons, the libraries must be
prepared in similar manners. Certain parameters appear to
be particularly important to control. One such parameter
is the method of isolating mRNA. It is important to use
the same conditions to remove DNA and heterogeneous nuclear
RNA from comparison libraries. Size fractionation of cDNA
must be carefully controlled. The same vector preferably
should be used for preparing libraries to be compared. At
the very least, the same type of vector (e.g.,
unidirectional vector) should be used to assure a valid
comparison. A unidirectional vector may be preferred in
order to more easily analyze the output.

It is preferred to prime only with oligo dT
unidirectional primer in order to obtain only one clone per
mRNA transcript when obtaining cDNAs. However, it is
recognized that employing a mixture of oligo dT and random
primers can also be advantageous because such a mixture
results in more sequence diversity when gene discovery also
is a goal. Similar effects can be obtained with DR2
(Clontech) and HXLOX (US Biochemical) and also vectors from
Invitrogen and Novagen. These vectors have two
requirements. First, there must be primer sites for
commercially available primers such as T3 or M13 reverse
primers. Second, the vector must accept inserts up to 10
kB.

It also is important that the clones be randomly
sampled, and that a significant population of clones is
used. Data have been generated with 5,000 clones; however,
if very rare genes are to be obtained and/or their relative
abundance determined, as many as 100,000 clones from a
single library may need to be sampled. Size fractionation
of cDNA also must be carefully controlled. Alternately,
plaques can be selected, rather than clones.
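
The effect of sample size on the detection of rare
transcripts may be illustrated by the following sketch. It
assumes, as a simplification not stated in the text, that
clones are drawn independently and at random, so that a
transcript making up a fraction f of the library is seen at
least once among N clones with probability 1 - (1 - f)^N.
The function names and the example fraction of one
transcript in 20,000 are hypothetical.

```python
import math


def detection_probability(fraction, clones_sampled):
    """Probability of sampling a transcript at least once.

    Assumes independent random sampling of clones (an assumption of
    this sketch), where `fraction` is the transcript's share of the
    library.
    """
    return 1.0 - (1.0 - fraction) ** clones_sampled


def clones_needed(fraction, confidence=0.95):
    """Smallest number of clones giving the requested confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fraction))


rare = 1 / 20000  # hypothetical rare transcript
p_5000 = detection_probability(rare, 5000)
p_100000 = detection_probability(rare, 100000)
```

Under these assumptions, such a transcript is more likely
than not to be missed among 5,000 clones, and is very
likely to be observed among 100,000 clones.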

Besides the Uni-ZAP™ vector system by Stratagene
disclosed below, it is now believed that other similarly
unidirectional vectors also can be used. For example, it
is believed that such vectors include, but are not limited
to, DR2 (Clontech) and HXLOX (U.S. Biochemical).

Preferably, the details of library construction (as
shown in Figure 1) are collected and stored in a database
for later retrieval relative to the sequences being
compared. Fig. 1 shows important information regarding the
library collaborator or cell or cDNA supplier,
pretreatment, biological source, culture, mRNA preparation,
and cDNA construction. Similarly detailed information
about the other steps is beneficial in analyzing sequences
and libraries in depth.

RNA must be harvested from cells and tissue samples
and cDNA libraries are subsequently constructed. cDNA
libraries can be constructed according to techniques known
in the art. (See, for example, Maniatis, T. et al. (1982)
Molecular Cloning, Cold Spring Harbor Laboratory, New
York). cDNA libraries may also be purchased. The U-937
cDNA library (catalog No. 937207) was obtained from
Stratagene, Inc., 11099 N. Torrey Pines Rd., La Jolla, CA
92037.

The THP-1 cDNA library was custom constructed by
Stratagene from THP-1 cells cultured 48 hours with 100 nM
TPA and 4 hours with 1 µg/ml LPS. The human mast cell
HMC-1 cDNA library was also custom constructed by
Stratagene from cultured HMC-1 cells. The HUVEC cDNA
library was custom constructed by Stratagene from two
batches of induced HUVEC cells which were separately
processed.

Essentially, all the libraries were prepared in the
same manner. First, poly(A+)RNA (mRNA) was purified. For
the U-937 and HMC-1 RNA, cDNA synthesis was only primed
with oligo dT. For the THP-1 and HUVEC RNA, cDNA synthesis
was primed separately with both oligo dT and random
hexamers, and the two cDNA libraries were treated
separately. Synthetic adaptor oligonucleotides were
ligated onto cDNA ends enabling its insertion into the
Uni-ZAP™ vector system (Stratagene), allowing high
efficiency unidirectional (sense orientation) lambda
library construction and the convenience of a plasmid
system with blue-white color selection to detect clones
with cDNA insertions. Finally, the two libraries were
combined into a single library by mixing equal numbers of
bacteriophage.

The libraries can be screened with either DNA probes
or antibody probes and the pBluescript® phagemid
(Stratagene) can be rapidly excised in vivo. The phagemid
allows the use of a plasmid system for easy insert
characterization, sequencing, site-directed mutagenesis,
the creation of unidirectional deletions and expression of
fusion proteins. The custom-constructed library phage
particles were infected into E. coli host strain XL1-Blue®
(Stratagene), which has a high transformation efficiency,
increasing the probability of obtaining rare,
under-represented clones in the cDNA library.

6.3. ISOLATION OF cDNA CLONES

The phagemid forms of individual cDNA clones were
obtained by the in vivo excision process, in which the host
bacterial strain was coinfected with both the lambda
library phage and an f1 helper phage. Proteins derived
from both the library-containing phage and the helper phage
nicked the lambda DNA, initiated new DNA synthesis from
defined sequences on the lambda target DNA and created a
smaller, single stranded circular phagemid DNA molecule
that included all DNA sequences of the pBluescript® plasmid
and the cDNA insert. The phagemid DNA was secreted from
the cells and purified, then used to re-infect fresh host
cells, where the double stranded phagemid DNA was produced.
Because the phagemid carries the gene for beta-lactamase,
the newly-transformed bacteria are selected on medium
containing ampicillin.

Phagemid DNA was purified using the Magic Minipreps™
DNA Purification System (Promega catalogue #A7100, Promega
Corp., 2800 Woods Hollow Rd., Madison, WI 53711). This
small-scale process provides a simple and reliable method
for lysing the bacterial cells and rapidly isolating
purified phagemid DNA using a proprietary DNA-binding
resin. The DNA was eluted from the purification resin
already prepared for DNA sequencing and other analytical
manipulations.

Phagemid DNA was also purified using the QIAwell-8
Plasmid Purification System from QIAGEN® DNA Purification
System (QIAGEN Inc., 9259 Eton Ave., Chatsworth, CA
91311). This product line provides a convenient, rapid and
reliable high-throughput method for lysing the bacterial
cells and isolating highly purified phagemid DNA using
QIAGEN anion-exchange resin particles with EMPORE™ membrane
technology from 3M in a multiwell format. The DNA was
eluted from the purification resin already prepared for DNA
sequencing and other analytical manipulations.

An alternate method of purifying phagemid has recently
become available. It utilizes the Miniprep Kit (Catalog
No. 77468, available from Advanced Genetic Technologies
Corp., 19212 Orbit Drive, Gaithersburg, Maryland). This
kit is in the 96-well format and provides enough reagents
for 960 purifications. Each kit is provided with a
recommended protocol, which has been employed except for
the following changes. First, the 96 wells are each filled
with only 1 ml of sterile terrific broth with carbenicillin
at 25 mg/L and glycerol at 0.4%. After the wells are
inoculated, the bacteria are cultured for 24 hours and
lysed with 60 µl of lysis buffer. A centrifugation step
(2900 rpm for 5 minutes) is performed before the contents
of the block are added to the primary filter plate. The
optional step of adding isopropanol to TRIS buffer is not
routinely performed. After the last step in the protocol,
samples are transferred to a Beckman 96-well block for
storage.

Another new DNA purification system is the WIZARD™ 
product line which is available from Promega (catalog No. 
A7071) and may be adaptable to the 96-well format. 

6.4. SEQUENCING OF cDNA CLONES

The cDNA inserts from random isolates of the U-937 and
THP-1 libraries were sequenced in part. Methods for DNA
sequencing are well known in the art. Conventional
enzymatic methods employ DNA polymerase Klenow fragment,
Sequenase® or Taq polymerase to extend DNA chains from an
oligonucleotide primer annealed to the DNA template of
interest. Methods have been developed for the use of both
single- and double-stranded templates. The chain
termination reaction products are usually electrophoresed
on urea-acrylamide gels and are detected either by
autoradiography (for radionuclide-labeled precursors) or by
fluorescence (for fluorescent-labeled precursors). Recent
improvements in mechanized reaction preparation, sequencing
and analysis using the fluorescent detection method have
permitted expansion in the number of sequences that can be
determined per day (such as the Applied Biosystems 373 and
377 DNA sequencer, Catalyst 800). Currently with the
system as described, read lengths range from 250 to 400
bases and are clone dependent. Read length also varies
with the length of time the gel is run. In general, the
shorter runs tend to truncate the sequence. A minimum of
only about 25 to 50 bases is necessary to establish the
identification and degree of homology of the sequence.

Gene transcript imaging can be used with any
sequence-specific method, including, but not limited to,
hybridization, mass spectroscopy, capillary electrophoresis
and SDS gel electrophoresis.

6.5. HOMOLOGY SEARCHING OF cDNA CLONE AND
DEDUCED PROTEIN (and Subsequent Steps)

Using the nucleotide sequences derived from the cDNA 
clones as query sequences (sequences of a Sequence 
Listing) , databases containing previously identified 
sequences are searched for areas of homology (similarity) . 
Examples of such databases include Genbank and EMBL. We
next describe examples of two homology search algorithms 
that can be used, and then describe the subsequent 
computer-implemented steps to be performed in accordance 
with preferred embodiments of the invention. 

In the following description of the computer-
implemented steps of the invention, the word "library"
denotes a set (or population) of biological specimen
nucleic acid sequences. A "library" can consist of cDNA
sequences, RNA sequences, or the like, which characterize a
biological specimen. The biological specimen can consist
of cells of a single human cell type (or can be any of the
other above-mentioned types of specimens). We contemplate
that the sequences in a library have been determined so as
to accurately represent or characterize a biological
specimen (for example, they can consist of representative
cDNA sequences from clones of RNA taken from a single human
cell).

In the following description of the computer-
implemented steps of the invention, the expression
"database" denotes a set of stored data which represent a
collection of sequences, which in turn represent a
collection of biological reference materials. For example,
a database can consist of data representing many stored
cDNA sequences which are in turn representative of human
cells infected with various viruses, cells of humans of
various ages, cells from different mammalian species, and
so on.

In preferred embodiments, the invention employs a
computer programmed with software (to be described) for
performing the following steps:

(a) processing data indicative of a library of cDNA
sequences (generated as a result of high-throughput cDNA
sequencing or other method) to determine whether each
sequence in the library matches a DNA sequence of a
reference database of DNA sequences (and if so, identifying
the reference database entry which matches the sequence and
indicating the degree of match between the reference
sequence and the library sequence) and assigning an
identified sequence value based on the sequence annotation
and degree of match to each of the sequences in the
library;

(b) for some or all entries of the database,
tabulating the number of matching identified sequence
values in the library (although this can be done by human
hand from a printout of all entries, we prefer to perform
this step using computer software to be described below),
thereby generating a set of final data values or "abundance
numbers"; and

(c) if the libraries are different sizes, dividing
each abundance number by the total number of sequences in
the library, to obtain a relative abundance number for each
identified sequence value (i.e., a relative abundance of
each gene transcript).

The list of identified sequence values (or genes
corresponding thereto) can then be sorted by abundance in
the cDNA population. A multitude of additional types of
comparisons or dimensions are possible.
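
Purely as an illustrative sketch, and not as the program of
Table 5, steps (b) and (c) and the subsequent sort can be
expressed in a few lines. The function name and the example
identified sequence values are hypothetical; step (a) is
assumed to have already produced one identified sequence
value per library sequence.

```python
from collections import Counter


def abundance_table(identified_values, normalize=True):
    """Tabulate and sort identified sequence values for one library.

    identified_values holds one entry per sequenced clone, i.e. the
    (hypothetical) output of step (a).
    """
    counts = Counter(identified_values)   # step (b): abundance numbers
    total = len(identified_values)        # total sequences in the library
    rows = []
    for value, number in counts.items():
        # step (c): relative abundance, only when normalization is wanted
        relative = number / total if normalize else None
        rows.append((value, number, relative))
    # Most abundant first; ties broken alphabetically for stable output.
    rows.sort(key=lambda row: (-row[1], str(row[0])))
    return rows


# Hypothetical library of six sequenced clones.
library = ["actin", "ferritin", "actin", "unknown-17", "actin", "ferritin"]
table = abundance_table(library)
```

Each row carries the identified sequence value, its
abundance number and its relative abundance, so the same
table can serve both for within-library ranking and for
comparisons between libraries of different sizes.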

For example (to be described below in greater detail),
steps (a) and (b) can be repeated for two different
libraries (sometimes referred to as a "target" library and
a "subtractant" library). Then, for each identified
sequence value (or gene transcript), a "ratio" value is
obtained by dividing the abundance number (for that
identified sequence value) for the target library by the
abundance number (for that identified sequence value) for
the subtractant library.
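
The ratio calculation described above may be sketched as
follows. The function name and example counts are
hypothetical. The text does not say how a transcript absent
from the subtractant library is to be treated; returning
None in that case is an assumption of this sketch.

```python
def ratio_values(target, subtractant):
    """Divide target abundance numbers by subtractant abundance numbers.

    Both arguments map an identified sequence value to its abundance
    number. A transcript missing from the subtractant library has no
    defined ratio and is reported as None (an assumption of this sketch).
    """
    ratios = {}
    for value in set(target) | set(subtractant):
        t = target.get(value, 0)
        s = subtractant.get(value, 0)
        ratios[value] = t / s if s else None
    return ratios


# Hypothetical abundance numbers for two libraries.
ratios = ratio_values({"a": 6, "b": 2}, {"a": 3, "c": 1})
```

Where the two libraries differ in size, relative abundance
numbers from step (c) could be passed in place of raw
abundance numbers.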

In fact, subtraction may be carried out on multiple
libraries. It is possible to add the transcripts from
several libraries (for example, three) and then to divide
them by another set of transcripts from multiple libraries
(again, for example, three). Notation for this operation
may be abbreviated as (A+B+C) / (D+E+F), where the capital
letters each indicate an entire library. Optionally the
abundance numbers of transcripts in the summed libraries
may be divided by the total sample size before subtraction.
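
A minimal sketch of the (A+B+C) / (D+E+F) operation is
given below. The function name and library contents are
hypothetical, and the sketch reports ratios only for
transcripts present in both pooled sets, which is an
assumption rather than a requirement of the text.

```python
from collections import Counter


def pooled_ratio(numerator_libs, denominator_libs, normalize=False):
    """Compute (A+B+C)/(D+E+F) style ratios over pooled libraries.

    Each library maps an identified sequence value to its abundance
    number. With normalize=True, pooled counts are first divided by
    the total pooled sample size.
    """
    def pool(libs):
        pooled = Counter()
        for lib in libs:
            pooled.update(lib)  # adds counts transcript by transcript
        if normalize:
            total = sum(pooled.values())
            return {k: v / total for k, v in pooled.items()} if total else {}
        return dict(pooled)

    num, den = pool(numerator_libs), pool(denominator_libs)
    # Only transcripts seen in both pooled sets (assumption of this sketch).
    return {k: num[k] / den[k] for k in num if den.get(k)}


A, B, C = {"x": 2, "y": 1}, {"x": 1}, {"x": 1}
D, E, F = {"x": 1}, {"x": 1, "y": 1}, {}
raw = pooled_ratio([A, B, C], [D, E, F])
scaled = pooled_ratio([A, B, C], [D, E, F], normalize=True)
```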

Unlike standard hybridization technology which permits
a single subtraction of two libraries, once one has
processed a set of library transcript sequences and stored
them in the computer, any number of subtractions can be
performed on the library. For example, by this method,
ratio values can be obtained by dividing relative abundance
values in a first library by corresponding values in a
second library and vice versa.

In variations on step (a), the library consists of
nucleotide sequences derived from cDNA clones. Examples of
databases which can be searched for areas of homology
(similarity) in step (a) include the commercially available
databases known as Genbank (NIH), EMBL (European Molecular
Biology Labs, Germany), and GENESEQ (Intelligenetics,
Mountain View, California).

One homology search algorithm which can be used to
implement step (a) is the algorithm described in the paper
by D.J. Lipman and W.R. Pearson, entitled "Rapid and
Sensitive Protein Similarity Searches", Science 227:1435
(1985). In this algorithm, the homologous regions are
searched in a two-step manner. In the first step, the
highest homologous regions are determined by calculating a
matching score using a homology score table. The parameter
"Ktup" is used in this step to establish the minimum window
size to be shifted for comparing two sequences. Ktup also
sets the number of bases that must match to extract the
highest homologous region among the sequences. In this
step, no insertions or deletions are applied and the
homology is displayed as an initial (INIT) value.

In the second step, the homologous regions are aligned
to obtain the highest matching score by inserting a gap in
order to add a probable deleted portion. The matching
score obtained in the first step is recalculated using the
homology score Table and the insertion score Table to an
optimized (OPT) value in the final output.
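
The role of Ktup in the first, ungapped step can be
illustrated with the simplified sketch below. It merely
counts shared k-tuples along each diagonal of the
comparison and reports the best diagonal; it does not use a
homology score table and is not the published algorithm.
The function name and test sequences are hypothetical.

```python
from collections import defaultdict


def best_diagonal(query, subject, ktup=4):
    """Return (diagonal, hits) for the diagonal sharing the most k-tuples.

    A diagonal is the offset j - i between a subject position j and a
    query position i. No insertions or deletions are considered.
    """
    # Index every k-tuple of the subject by its start positions.
    positions = defaultdict(list)
    for j in range(len(subject) - ktup + 1):
        positions[subject[j:j + ktup]].append(j)
    # Count k-tuple matches per diagonal.
    hits = defaultdict(int)
    for i in range(len(query) - ktup + 1):
        for j in positions.get(query[i:i + ktup], ()):
            hits[j - i] += 1
    if not hits:
        return None
    return max(hits.items(), key=lambda item: item[1])


result = best_diagonal("ACGTACGG", "TTACGTACGGTT", ktup=4)
```

A larger Ktup makes the lookup faster and more selective,
at the cost of missing regions whose matches are
interrupted more often than every Ktup bases.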

DNA homologies between two sequences can be examined
graphically using the Harr method of constructing dot
matrix homology plots (Needleman, S.B. and Wunsch, C.D.,
J. Mol. Biol. 48:443 (1970)). This method produces a
two-dimensional plot which can be useful in determining
regions of homology versus regions of repetition.
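
A text-only dot matrix can be sketched as follows to show
how homology and repetition appear differently in such a
plot. The function name, the window length and the example
sequences are hypothetical, and exact window matching is
assumed for simplicity.

```python
def dot_matrix(seq_a, seq_b, window=3):
    """Build a text dot matrix: '*' where windows match exactly.

    Row i corresponds to the window of seq_a starting at i and column j
    to the window of seq_b starting at j.
    """
    rows = []
    for i in range(len(seq_a) - window + 1):
        row = ""
        for j in range(len(seq_b) - window + 1):
            same = seq_a[i:i + window] == seq_b[j:j + window]
            row += "*" if same else "."
        rows.append(row)
    return rows


homology = dot_matrix("ACGT", "ACGT", window=2)
repetition = dot_matrix("ACAC", "ACAC", window=2)
```

In the first example the marks fall on a single diagonal,
indicating one region of homology; in the second, the
additional off-diagonal marks reveal an internal repeat.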

However, in a class of preferred embodiments, step (a)
is implemented by processing the library data in the
commercially available computer program known as the
INHERIT 670 Sequence Analysis System, available from
Applied Biosystems Inc. (Foster City, California),
including the software known as the Factura software (also
available from Applied Biosystems Inc.). The Factura
program preprocesses each library sequence to "edit out"
portions thereof which are not likely to be of interest,
such as the vector used to prepare the library. Additional
sequences which can be edited out or masked (ignored by the
search tools) include but are not limited to the polyA tail
and repetitive CAG and CCC sequences. A low-end search
program can be written to mask out such "low-information"
sequences, or programs such as BLAST can ignore the
low-information sequences.
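
A low-end masking routine of the kind contemplated above
might look like the following sketch. It is not the Factura
program. The function name, the minimum polyA run of ten
bases and the minimum of four tandem repeat units are all
assumptions chosen for illustration.

```python
import re


def mask_low_information(seq, min_run=10, mask_char="N"):
    """Replace a 3' polyA tail and CAG/CCC tandem repeats with mask_char.

    The sequence length is preserved so that coordinates remain valid.
    Thresholds are illustrative assumptions, not values from the text.
    """
    seq = seq.upper()
    # Mask a run of at least min_run A's at the 3' end (polyA tail).
    seq = re.sub(r"A{%d,}$" % min_run,
                 lambda m: mask_char * len(m.group()), seq)
    # Mask four or more tandem copies of CAG or of CCC.
    return re.sub(r"(?:CAG){4,}|(?:CCC){4,}",
                  lambda m: mask_char * len(m.group()), seq)


raw_read = "GATTACA" + "CAG" * 4 + "TTGC" + "A" * 12
masked = mask_low_information(raw_read)
```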

In the algorithm implemented by the INHERIT 670
Sequence Analysis System, the Pattern Specification
Language (developed by TRW Inc.) is used to determine
regions of homology. "There are three parameters that
determine how INHERIT analysis runs sequence comparisons:
window size, window offset and error tolerance. Window
size specifies the length of the segments into which the
query sequence is subdivided. Window offset specifies
where to start the next segment [to be compared], counting
from the beginning of the previous segment. Error
tolerance specifies the total number of insertions,
deletions and/or substitutions that are tolerated over the
specified word length. Error tolerance may be set to any
integer between 0 and 6. The default settings are window
tolerance=20, window offset=10 and error tolerance=3."
INHERIT Analysis Users Manual, pp. 2-15, Version 1.0,
Applied Biosystems, Inc., October 1991.
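
The three parameters quoted above can be illustrated with
the following sketch, which is not the INHERIT
implementation. It assumes that the quoted default "window
tolerance=20" refers to a window size of 20, and it uses
ordinary edit distance as a stand-in for the count of
insertions, deletions and substitutions. All function names
are hypothetical.

```python
def query_windows(query, window_size=20, window_offset=10):
    """Split a query into (start, segment) windows.

    Trailing bases that do not fill a complete window are not covered.
    """
    last_start = max(len(query) - window_size, 0)
    return [(start, query[start:start + window_size])
            for start in range(0, last_start + 1, window_offset)]


def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def within_tolerance(segment, candidate, error_tolerance=3):
    """True if candidate matches segment within the error tolerance."""
    return edit_distance(segment, candidate) <= error_tolerance


windows = query_windows("ACGT" * 12 + "AC")  # hypothetical 50-base query
```

With a window offset smaller than the window size,
successive segments overlap, so a region of homology that
straddles a window boundary is still covered by a
neighboring segment.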

Using a combination of these three parameters, a
database (such as a DNA database) can be searched for
sequences containing regions of homology and the
appropriate sequences are scored with an initial value.
Subsequently, these homologous regions are examined using
dot matrix homology plots to determine regions of homology
versus regions of repetition. Smith-Waterman alignments
can be used to display the results of the homology search.
The INHERIT software can be executed by a Sun computer
system programmed with the UNIX operating system.

Search alternatives to INHERIT include the BLAST
program, GCG (available from the Genetics Computer Group,
WI) and the Dasher program (Temple Smith, Boston
University, Boston, MA). Nucleotide sequences can be
searched against Genbank, EMBL or custom databases such as
GENESEQ (available from Intelligenetics, Mountain View, CA)
or other databases for genes. In addition, we have
searched some sequences against our own in-house database.

In preferred embodiments, the transcript sequences are
analyzed by the INHERIT software for best conformance with
a reference gene transcript to assign a sequence identifier
and assigned the degree of homology, which together are the
identified sequence value and are input into, and further
processed by, a Macintosh personal computer (available from
Apple) programmed with an "abundance sort and subtraction
analysis" computer program (to be described below).

Prior to the abundance sort and subtraction analysis
program (also denoted as the "abundance sort" program),
identified sequences from the cDNA clones are assigned
value (according to the parameters given above) by degree
of match according to the following categories: "exact"
matches (regions with a high degree of identity),
homologous human matches (regions of high similarity, but
not "exact" matches), homologous non-human matches (regions
of high similarity present in species other than human), or
non matches (no significant regions of homology to
previously identified nucleotide sequences stored in the
form of the database). Alternately, the degree of match
can be a numeric value as described below.
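
The four categories above may be assigned by a rule such as
the one sketched below. The text gives no numeric cutoffs,
so the 95% and 60% identity thresholds are assumptions, as
are the function name and the treatment of "exact" as
applying only to human reference entries.

```python
def match_category(percent_identity, species,
                   exact_cutoff=95.0, homolog_cutoff=60.0):
    """Classify a best database hit into one of four match categories.

    percent_identity is None when no hit was found. The cutoffs are
    illustrative assumptions, not values from the specification.
    """
    if percent_identity is None or percent_identity < homolog_cutoff:
        return "non-match"
    if species == "human":
        if percent_identity >= exact_cutoff:
            return "exact"
        return "homologous human"
    return "homologous non-human"


# Hypothetical hits: (percent identity, species of reference entry).
examples = [(99.0, "human"), (80.0, "human"), (99.0, "mouse"), (30.0, "human")]
labels = [match_category(p, s) for p, s in examples]
```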

With reference again to the step of identifying
matches between reference sequences and database entries,
protein and peptide sequences can be deduced from the
nucleic acid sequences. Using the deduced polypeptide
sequence, the match identification can be performed in a
manner analogous to that done with cDNA sequences. A
protein sequence is used as a query sequence and compared
to the previously identified sequences contained in a
database such as the Swiss/Prot, PIR and the NBRF Protein
database to find homologous proteins. These proteins are
initially scored for homology using a homology score Table
(Orcutt, B.C. and Dayhoff, M.O., Scoring Matrices, PIR
Report MAT-0285 (February 1985)) resulting in an INIT
score. The homologous regions are aligned to obtain the
highest matching scores by inserting a gap which adds a
probable deleted portion. The matching score is
recalculated using the homology score Table and the
insertion score Table resulting in an optimized (OPT)
score. Even in the absence of knowledge of the proper
reading frame of an isolated sequence, the above-described
protein homology search may be performed by searching all 3
reading frames.
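
Deduction of protein sequences in all three reading frames
can be sketched as follows, using the standard genetic
code. The function names and the example sequence are
hypothetical; only the three forward frames named in the
text are translated, and codons containing ambiguous bases
are rendered as "X" by assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate one forward reading frame (0, 1 or 2).

    An incomplete trailing codon is ignored; unknown codons give 'X'.
    """
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(seq):
    """Return the deduced peptides for the three forward frames."""
    return [translate(seq, frame) for frame in range(3)]


peptides = three_frame_translation("ATGGCC")
```

Each deduced peptide can then be used as a separate query
against the protein database, and the frame giving the best
match retained.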

Peptide and protein sequence homologies can also be
ascertained using the INHERIT 670 Sequence Analysis System
in an analogous way to that used in DNA sequence
homologies. Pattern Specification Language and parameter
windows are used to search protein databases for sequences
containing regions of homology which are scored with an
initial value. Subsequent display in a dot-matrix homology
plot shows regions of homology versus regions of
repetition. Additional search tools that are available to
use on pattern search databases include PLsearch Blocks
(available from Henikoff & Henikoff, University of
Washington, Seattle), Dasher and GCG. Pattern search
databases include, but are not limited to, Protein Blocks
(available from Henikoff & Henikoff, University of
Washington, Seattle), Brookhaven Protein (available from
the Brookhaven National Laboratory, Brookhaven, MA),
PROSITE (available from Amos Bairoch, University of Geneva,
Switzerland), ProDom (available from Temple Smith, Boston
University), and PROTEIN MOTIF FINGERPRINT (available from
University of Leeds, United Kingdom).

The ABI Assembler application software, part of the
INHERIT DNA analysis system (available from Applied
Biosystems, Inc., Foster City, CA), can be employed to
create and manage sequence assembly projects by assembling
data from selected sequence fragments into a larger
sequence. The Assembler software combines two advanced
computer technologies which maximize the ability to
assemble sequenced DNA fragments into Assemblages, a
special grouping of data where the relationships between
sequences are shown by graphic overlap, alignment and
statistical views. The process is based on the
Meyers-Kececioglu model of fragment assembly (INHERIT™
Assembler User's Manual, Applied Biosystems, Inc., Foster
City, CA), and uses graph theory as the foundation of a
very rigorous multiple-sequence alignment engine for
assembling DNA sequence fragments. Other assembly programs
that can be used include MEGALIGN (available from DNASTAR
Inc., Madison, WI), Dasher and STADEN (available from Roger
Staden, Cambridge, England).

Next, with reference to Fig. 2, we describe in more
detail the "abundance sort" program which implements
above-mentioned "step (b)" to tabulate the number of
sequences of the library which match each database entry
(the "abundance number" for each database entry).
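The tabulation performed in "step (b)" can be illustrated with a short sketch. This is not the FoxBASE listing of Table 5; it is a minimal Python illustration under stated assumptions. The function name `abundance_numbers`, the record layout (clone identifier paired with a matched entry, or `None` when nothing matched) and the sample identifiers are all hypothetical.

```python
from collections import Counter


def abundance_numbers(identified_sequences):
    """Tabulate how many library sequences match each database entry.

    identified_sequences: iterable of (clone_id, entry) pairs, where entry
    is the matched database entry, or None if the clone matched nothing.
    """
    return Counter(entry for _clone, entry in identified_sequences
                   if entry is not None)


# Hypothetical identified sequences (illustrative only).
library = [
    (1, "GENE_A"),
    (2, "GENE_B"),
    (3, "GENE_A"),
    (4, None),
    (5, "GENE_A"),
]
counts = abundance_numbers(library)
```

Here `counts` maps each database entry to its abundance number; unmatched clones are simply left out of the tally in this sketch.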

Fig. 2 is a flow chart of a preferred embodiment of
the abundance sort program. A source code listing of this
embodiment of the abundance sort program is set forth in
Table 5. In the Table 5 implementation, the abundance sort
program is written using the FoxBASE programming language
commercially available from Microsoft Corporation.
Although FoxBASE was the program chosen for the first
iteration of this technology, it should not be considered
limiting. Many other programming languages, Sybase being a
particularly desirable alternative, can also be used, as
will be obvious to one with ordinary skill in the art. The
subroutine names specified in Fig. 2 correspond to
subroutines listed in Table 5.

With reference again to Fig. 2, the "Identified
Sequences" are transcript sequences representing each
sequence of the library and a corresponding identification
of the database entry (if any) which it matches. In other
words, the "Identified Sequences" are transcript sequences
representing the output of above-discussed "step (a)."

Fig. 3 is a block diagram of a system for implementing
the invention. The Fig. 3 system includes library
generation unit 2 which generates a library and asserts an
output stream of transcript sequences indicative of the
biological sequences comprising the library. Programmed
processor 4 receives the data stream output from unit 2 and
processes this data in accordance with above-discussed
"step (a)" to generate the Identified Sequences. Processor
4 can be a processor programmed with the commercially
available computer program known as the INHERIT 670
Sequence Analysis System and the commercially available
computer program known as the Factura program (both
available from Applied Biosystems Inc.) and with the UNIX
operating system.

Still with reference to Fig. 3, the Identified
Sequences are loaded into processor 6 which is programmed
with the abundance sort program. Processor 6 generates the
Final Transcript sequences indicated in both Figs. 2 and 3.

Fig. 4 shows a more detailed block diagram of a planned
relational computer system, including various searching
techniques which can be implemented, along with an
assortment of databases to query against.

With reference to Fig. 2, the abundance sort program
first performs an operation known as "Tempnum" on the
Identified Sequences, to discard all of the Identified
Sequences except those which match database entries of
selected types. For example, the Tempnum process can
select Identified Sequences which represent matches of the
following types with database entries (see above for
definition): "exact" matches, human "homologous" matches,
"other species" matches (representing genes present in
species other than human), "no" matches (no significant
regions of homology with database entries representing
previously identified nucleotide sequences), "I" matches
(Incyte for not previously known DNA sequences), or "X"
matches (matches ESTs in reference database). This
eliminates the U, S, M, V, A, R and D sequences (see Table 1
for definitions).
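The "Tempnum" selection described above amounts to filtering records by match type. The following Python sketch shows one way such a filter could look; it is not the Table 5 code. The record layout, the function name `tempnum` and the use of descriptive strings for the retained match types are assumptions made for illustration (the actual single-letter codes are defined in Table 1).

```python
# Match types retained by the filter. The descriptive labels are
# placeholders; only "I" and "X" (and the discarded codes U, S, M, V, A,
# R, D) are named explicitly in the text.
SELECTED_TYPES = {"exact", "homologous", "other species", "no", "I", "X"}


def tempnum(identified_sequences, selected_types=SELECTED_TYPES):
    """Keep only the records whose match type is one of the selected types."""
    return [rec for rec in identified_sequences
            if rec["match_type"] in selected_types]


# Hypothetical records (illustrative only).
records = [
    {"clone": 1, "entry": "GENE_A", "match_type": "exact"},
    {"clone": 2, "entry": None, "match_type": "U"},
    {"clone": 3, "entry": "GENE_B", "match_type": "homologous"},
    {"clone": 4, "entry": None, "match_type": "M"},
]
kept = tempnum(records)
```

Passing a different `selected_types` set changes which match types survive, which mirrors the statement that the selected types are only an example.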

The identified sequence values selected during the
"Tempnum" process then undergo a further selection (weeding
out) operation known as "Tempred." This operation can, for
example, discard all identified sequence values
representing matches with selected database entries.

The identified sequence values selected during the
"Tempred" process are then classified according to library,
during the "Tempdesig" operation. It is contemplated that
the "Identified Sequences" can represent sequences from a
single library, or from two or more libraries.

Consider first the case that the identified sequence
values represent sequences from a single library. In this
case, all the identified sequence values determined during
"Tempred" undergo sorting in the "Templib" operation,
further sorting in the "Libsort" operation, and finally
additional sorting in the "Temptarsort" operation. For
example, these three sorting operations can sort the
identified sequences in order of decreasing "abundance
number" (to generate a list of decreasing abundance
numbers, each abundance number corresponding to a unique
identified sequence entry, or several lists of decreasing
abundance numbers, with the abundance numbers in each list
corresponding to database entries of a selected type) with
redundancies eliminated from each sorted list. In this
case, the operation identified as "Cruncher" can be
bypassed, so that the "Final Data" values are the organized
transcript sequences produced during the "Temptarsort"
operation.
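The net effect of the three sorting operations in the single-library case can be sketched as follows. This Python illustration collapses "Templib," "Libsort" and "Temptarsort" into one step and is not the Table 5 implementation; the function names, the (entry, match type) record layout and the alphabetical tie-break are assumptions.

```python
from collections import Counter, defaultdict


def sort_by_abundance(records):
    """Return (entry, abundance) pairs, one per unique entry, most abundant first."""
    counts = Counter(entry for entry, _match_type in records)
    # Ties are broken alphabetically by entry so the order is deterministic.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def sort_by_abundance_per_type(records):
    """Return one decreasing-abundance list for each match type."""
    by_type = defaultdict(list)
    for entry, match_type in records:
        by_type[match_type].append((entry, match_type))
    return {t: sort_by_abundance(recs) for t, recs in by_type.items()}


# Hypothetical records (illustrative only).
records = [
    ("GENE_A", "exact"), ("GENE_B", "exact"), ("GENE_A", "exact"),
    ("GENE_C", "homologous"), ("GENE_C", "homologous"), ("GENE_A", "exact"),
]
single_list = sort_by_abundance(records)
per_type = sort_by_abundance_per_type(records)
```

Counting before sorting is what eliminates redundancies here: each entry appears once in the output, paired with its abundance number.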

We next consider the case that the transcript
sequences produced during the "Tempred" operation represent
sequences from two libraries (which we will denote the
"target" library and the "subtractant" library). For
example, the target library may consist of cDNA sequences
from clones of a diseased cell, while the subtractant
library may consist of cDNA sequences from clones of the
diseased cell after treatment by exposure to a drug. For
another example, the target library may consist of cDNA
sequences from clones of a cell type from a young human,
while the subtractant library may consist of cDNA sequences
from clones of the same cell type from the same human at
different ages.

In this case, the "Tempdesig" operation routes all
transcript sequences representing the target library for
processing in accordance with "Templib" (and then "Libsort"
and "Temptarsort"), and routes all transcript sequences
representing the subtractant library for processing in
accordance with "Tempsub" (and then "Subsort" and
"Tempsubsort"). For example, the consecutive "Templib,"
"Libsort," and "Temptarsort" sorting operations sort
identified sequences from the target library in order of
decreasing abundance number (to generate a list of
decreasing abundance numbers, each abundance number
corresponding to a database entry, or several lists of
decreasing abundance numbers, with the abundance numbers in
each list corresponding to database entries of a selected
type) with redundancies eliminated from each sorted list.
The consecutive "Tempsub," "Subsort," and "Tempsubsort"
sorting operations sort identified sequences from the
subtractant library in order of decreasing abundance number
(to generate a list of decreasing abundance numbers, each
abundance number corresponding to a database entry, or
several lists of decreasing abundance numbers, with the
abundance numbers in each list corresponding to database
entries of a selected type) with redundancies eliminated
from each sorted list.

The transcript sequences output from the "Temptarsort"
operation typically represent sorted lists from which a
histogram could be generated in which position along one
(e.g., horizontal) axis indicates abundance number (of
target library sequences), and position along another
(e.g., vertical) axis indicates identified sequence value
(e.g., human or non-human gene type). Similarly, the
transcript sequences output from the "Tempsubsort"
operation typically represent sorted lists from which a
histogram could be generated in which position along one
(e.g., horizontal) axis indicates abundance number (of
subtractant library sequences), and position along another
(e.g., vertical) axis indicates identified sequence value
(e.g., human or non-human gene type).

The transcript sequences (sorted lists) output from
the Tempsubsort and Temptarsort sorting operations are
combined during the operation identified as "Cruncher."
The "Cruncher" process identifies pairs of corresponding
target and subtractant abundance numbers (both representing
the same identified sequence value), and divides one by the
other to generate a "ratio" value for each pair of
corresponding abundance numbers, and then sorts the ratio
values in order of decreasing ratio value. The data output
from the "Cruncher" operation (the Final Transcript
sequence in Fig. 2) is typically a sorted list from which a
histogram could be generated in which position along one
axis indicates the size of a ratio of abundance numbers
(for corresponding identified sequence values from target
and subtractant libraries) and position along another axis
indicates identified sequence value (e.g., gene type).
Preferably, prior to obtaining a ratio between the two
library abundance values, the Cruncher operation also
divides each ratio value by the total number of sequences
in one or both of the target and subtractant libraries.
The resulting lists of "relative" ratio values generated by
the Cruncher operation are useful for many medical,
scientific, and industrial applications. Also preferably,
the output of the Cruncher operation is a set of lists,
each list representing a sequence of decreasing ratio
values for a different selected subset (e.g. protein
family) of database entries.
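The pairing, division and sorting performed by the "Cruncher" operation can be sketched in a few lines. This is a minimal Python illustration, not the Table 5 code. The function name `cruncher` is hypothetical, and dividing the target abundance by the subtractant abundance is an assumption, since the text says only that one is divided by the other.

```python
def cruncher(target, subtractant):
    """Pair abundance numbers by entry, divide, and sort by decreasing ratio.

    target, subtractant: dicts mapping database entry -> abundance number.
    Only entries present in both libraries are paired in this sketch.
    """
    shared = target.keys() & subtractant.keys()
    ratios = [(entry, target[entry] / subtractant[entry]) for entry in shared]
    # Ties are broken alphabetically by entry so the order is deterministic.
    return sorted(ratios, key=lambda item: (-item[1], item[0]))


# Hypothetical abundance numbers (illustrative only).
target_counts = {"GENE_A": 8, "GENE_B": 2, "GENE_C": 5}
subtractant_counts = {"GENE_A": 2, "GENE_B": 4, "GENE_D": 1}
final_data = cruncher(target_counts, subtractant_counts)
```

Entries found in only one library have no partner to divide by, so this sketch leaves them out.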

In one example, the abundance sort program of the
invention tabulates for a library the numbers of mRNA
transcripts corresponding to each gene identified in a
database. These numbers are divided by the total number of
clones sampled. The results of the division reflect the
relative abundance of the mRNA transcripts in the cell type
or tissue from which they were obtained. Obtaining this
final data set is referred to herein as "gene transcript
image analysis." The resulting subtracted data show
exactly what proteins and genes are upregulated and
downregulated in highly detailed complexity.

Table 2 is an abundance table listing the various gene
transcripts in an induced HUVEC library. The transcripts
are listed in order of decreasing abundance. This
computerized sorting simplifies analysis of the tissue and
speeds identification of significant new proteins which are
specific to this cell type. This type of endothelial cell
lines tissues of the cardiovascular system, and the more
that is known about its composition, particularly in
response to activation, the more choices of protein targets
become available to affect in treating disorders of this
tissue, such as the highly prevalent atherosclerosis.
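The division by the total number of clones sampled can be illustrated with the figures that head Table 2, where the most abundant transcript accounts for 67 of the 5000 clones analyzed. The Python sketch below is illustrative only; the function name `relative_abundance` is hypothetical, and reading the N column of Table 2 as a clone count is an assumption.

```python
def relative_abundance(counts, total_clones):
    """Divide each abundance number by the total number of clones sampled."""
    if total_clones <= 0:
        raise ValueError("total_clones must be positive")
    return {entry: n / total_clones for entry, n in counts.items()}


# First row of Table 2: 67 clones out of 5000 analyzed.
image = relative_abundance({"Riboptn L41": 67}, 5000)
```

On these figures the transcript makes up 1.34% of the clones sampled (67 / 5000 = 0.0134).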

6.7. MONOCYTE-CELL AND MAST-CELL cDNA LIBRARIES

Tables 3 and 4 show truncated comparisons of two
libraries. In Tables 3 and 4 the "normal monocytes" are
the HMC-1 cells, and the "activated macrophages" are the
THP-1 cells pretreated with PMA and activated with LPS.
Table 3 lists in descending order of abundance the most
abundant gene transcripts for both cell types. With only
15 gene transcripts from each cell type, this table permits
quick, qualitative comparison of the most common
transcripts. This abundance sort, with its convenient
side-by-side display, provides an immediately useful
research tool. In this example, this research tool
discloses that 1) only one of the top 15 activated
macrophage transcripts is found in the top 15 normal
monocyte gene transcripts (poly A binding protein); and 2)
a new gene transcript (previously unreported in other
databases) is relatively highly represented in activated
macrophages but is not similarly prominent in normal
macrophages. Such a research tool provides researchers
with a short-cut to new proteins, such as receptors,
cell-surface and intracellular signalling molecules, which
can serve as drug targets in commercial drug screening
programs. Such a tool could save considerable time over
that consumed by a hit and miss discovery program aimed at
identifying important proteins in and around cells, because
those proteins carrying out everyday cellular functions and
represented as steady state mRNA are quickly eliminated
from further characterization.

This illustrates how the gene transcript profiles
change with altered cellular function. Those skilled in
the art know that the biochemical composition of cells also
changes with other functional changes such as cancer,
including cancer's various stages, and exposure to
toxicity. A gene transcript subtraction profile such as in
Table 3 is useful as a first screening tool for such gene
expression and protein studies.

6.8. SUBTRACTION ANALYSIS OF NORMAL MONOCYTE-CELL AND
ACTIVATED MONOCYTE CELL cDNA LIBRARIES



Once the cDNA data are in the computer, the computer
program as disclosed in Table 5 was used to obtain ratios
of all the gene transcripts in the two libraries discussed
in Example 6.7, and the gene transcripts were sorted by the
descending values of their ratios. If a gene transcript is
not represented in one library, that gene transcript's
abundance is unknown but appears to be less than 1. As an
approximation (and to obtain a ratio, which would not be
possible if the unrepresented gene were given an abundance
of zero), genes which are represented in only one of the
two libraries are assigned an abundance of 1/2. Using 1/2
for unrepresented clones increases the relative importance
of "turned-on" and "turned-off" genes, whose products would
be drug candidates. The resulting print-out is called a
subtraction table and is an extremely valuable screening
method, as is shown by the following data.
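The treatment of unrepresented genes can be made concrete with a short sketch. The Python below is a minimal illustration of the 1/2 rule, not the program of Table 5. The function name `subtraction_table`, the row layout and the choice of dividing the first library by the second are assumptions.

```python
def subtraction_table(target, subtractant, floor=0.5):
    """Build rows of (entry, target abundance, subtractant abundance, ratio).

    A gene absent from one library is assigned an abundance of 1/2 there,
    so that a ratio can still be formed. Rows are sorted by descending ratio.
    """
    rows = []
    for entry in target.keys() | subtractant.keys():
        t = target.get(entry, 0) or floor
        s = subtractant.get(entry, 0) or floor
        rows.append((entry, t, s, t / s))
    # Ties are broken alphabetically by entry so the order is deterministic.
    return sorted(rows, key=lambda row: (-row[3], row[0]))


# Hypothetical abundance numbers (illustrative only).
activated = {"GENE_A": 10, "GENE_B": 3}
normal = {"GENE_B": 6, "GENE_C": 4}
table = subtraction_table(activated, normal)
```

With these numbers, GENE_A ("turned on", absent from the second library) gets a ratio of 10 / 0.5 = 20 and rises to the top, while GENE_C ("turned off") gets 0.5 / 4 = 0.125 and sinks to the bottom.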

Table 4 is a subtraction table, in which the normal
monocyte library was electronically "subtracted" from the
activated macrophage library. This table highlights most
effectively the changes in abundance of the gene
transcripts by activation of macrophages. Even among the
first 20 gene transcripts listed, there are several unknown
gene transcripts. Thus, electronic subtraction is a useful
tool with which to assist researchers in identifying much
more quickly the basic biochemical changes between two cell
types. Such a tool can save universities and
pharmaceutical companies which spend billions of dollars on
research valuable time and laboratory resources at the
early discovery stage and can speed up the drug development
cycle, which in turn permits researchers to set up drug
screening programs much earlier. Thus, this research tool
provides a way to get new drugs to the public faster and
more economically.

Also, such a subtraction table can be obtained for
patient diagnosis. An individual patient sample (such as
monocytes obtained from a biopsy or blood sample) can be
compared with data provided herein to diagnose conditions
associated with macrophage activation.

Table 4 uncovered many new gene transcripts (labeled
Incyte clones). Note that many genes are turned on in the
activated macrophage (i.e., the monocyte had a 0 in the
bgfreq column). This screening method is superior to other
screening techniques, such as the western blot, which are
incapable of uncovering such a multitude of discrete new
gene transcripts.

The subtraction-screening technique has also uncovered
a high number of cancer gene transcripts (oncogenes rho,
ETS2, rab-2 ras, YPT1-related, and acute myeloid leukemia
mRNA) in the activated macrophage. These transcripts may
be attributed to the use of immortalized cell lines and are
inherently interesting for that reason. This screening
technique offers a detailed picture of upregulated
transcripts including oncogenes, which helps explain why
anti-cancer drugs interfere with the patient's immunity
mediated by activated macrophages. Armed with knowledge
gained from this screening method, those skilled in the art
can set up more targeted, more effective drug screening
programs to identify drugs which are differentially
effective against 1) both relevant cancers and activated
macrophage conditions with the same gene transcript
profile; 2) cancer alone; and 3) activated macrophage
conditions.

Smooth muscle senescent protein (22 kd) was
upregulated in the activated macrophage, which indicates
that it is a candidate to block in controlling
inflammation.

6.9. SUBTRACTION ANALYSIS OF NORMAL LIVER CELLS AND
HEPATITIS INFECTED LIVER CELL cDNA LIBRARIES

In this example, rats are exposed to hepatitis virus
and maintained in the colony until they show definite signs
of hepatitis. Of the rats diagnosed with hepatitis, one
half of the rats are treated with a new anti-hepatitis
agent (AHA). Liver samples are obtained from all rats
before exposure to the hepatitis virus and at the end of
AHA treatment or no treatment. In addition, liver samples
can be obtained from rats with hepatitis just prior to AHA
treatment.

The liver tissue is treated as described in Examples
6.2 and 6.3 to obtain mRNA and subsequently to sequence
cDNA. The cDNA from each sample are processed and analyzed
for abundance according to the computer program in Table 5.
The resulting gene transcript images of the cDNA provide
detailed pictures of the baseline (control) for each animal
and of the infected and/or treated state of the animals.
cDNA data for a group of samples can be combined into a
group summary gene transcript profile for all control
samples, all samples from infected rats and all samples
from AHA-treated rats.

Subtractions are performed between appropriate
individual libraries and the grouped libraries. For
individual animals, control and post-study samples can be
subtracted. Also, if samples are obtained before and after
AHA treatment, that data from individual animals and
treatment groups can be subtracted. In addition, the data
for all control samples can be pooled and averaged. The
control average can be subtracted from averages of both
post-study AHA and post-study non-AHA cDNA samples. If
pre- and post-treatment samples are available, pre- and
post-treatment samples can be compared individually (or
electronically averaged) and subtracted.
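Pooling and averaging a group of samples before subtraction can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the Table 5 code: each sample is taken to be a dictionary of relative abundances, a transcript missing from a sample is counted as 0 in the average, and the function name and sample values are hypothetical.

```python
def average_profile(profiles):
    """Average relative abundances across samples, entry by entry.

    profiles: list of dicts mapping database entry -> relative abundance.
    An entry missing from a sample contributes 0 to the average.
    """
    if not profiles:
        raise ValueError("at least one profile is required")
    entries = set()
    for profile in profiles:
        entries.update(profile)
    return {entry: sum(p.get(entry, 0.0) for p in profiles) / len(profiles)
            for entry in entries}


# Hypothetical relative abundances (illustrative only).
controls = [{"GENE_A": 0.5, "GENE_B": 0.25}, {"GENE_A": 0.25}]
treated = [{"GENE_A": 0.75}]
control_avg = average_profile(controls)
treated_avg = average_profile(treated)
ratio_a = treated_avg["GENE_A"] / control_avg["GENE_A"]
```

The averaged profiles can then be compared in the same way as two individual libraries; here the treated group shows twice the control level of GENE_A.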

These subtraction tables are used in two general ways.
First, the differences are analyzed for gene transcripts
which are associated with continuing hepatic deterioration
or healing. The subtraction tables are tools to isolate
the effects of the drug treatment from the underlying basic
pathology of hepatitis. Because hepatitis affects many
parameters, additional liver toxicity has been difficult to
detect with only blood tests for the usual enzymes. The
gene transcript profile and subtraction provides a much
more complex biochemical picture which researchers have
needed to analyze such difficult problems.

Second, the subtraction tables provide a tool for
identifying clinical markers, individual proteins or other
biochemical determinants which are used to predict and/or
evaluate a clinical endpoint, such as disease, improvement
due to the drug, and even additional pathology due to the
drug. The subtraction tables specifically highlight genes
which are turned on or off. Thus, the subtraction tables
provide a first screen for a set of gene transcript
candidates for use as clinical markers. Subsequently,
electronic subtractions of additional cell and tissue
libraries reveal which of the potential markers are in fact
found in different cell and tissue libraries. Candidate
gene transcripts found in additional libraries are removed
from the set of potential clinical markers. Then, tests of
blood or other relevant samples which are known to lack and
have the relevant condition are compared to validate the
selection of the clinical marker. In this method, the
particular physiologic function of the protein transcript
need not be determined to qualify the gene transcript as a
clinical marker.
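The two screening steps described above (a first screen from the subtraction table, then removal of candidates that also appear in other libraries) can be sketched as a set operation. The Python below is illustrative only. The function name `candidate_markers` and the fold-change cutoff `min_fold` are hypothetical; the text does not specify a numerical threshold for a gene being "turned on or off."

```python
def candidate_markers(subtraction_rows, other_libraries, min_fold=2.0):
    """Screen subtraction ratios for candidate clinical markers.

    subtraction_rows: iterable of (entry, ratio) pairs.
    other_libraries: iterable of sets of entries found in additional
        cell and tissue libraries.
    min_fold: assumed cutoff; an entry is kept as a first-screen candidate
        if its ratio is >= min_fold or <= 1 / min_fold.
    """
    changed = {entry for entry, ratio in subtraction_rows
               if ratio >= min_fold or ratio <= 1.0 / min_fold}
    seen_elsewhere = set()
    for library_entries in other_libraries:
        seen_elsewhere.update(library_entries)
    # Candidates that also occur in the additional libraries are removed.
    return sorted(changed - seen_elsewhere)


# Hypothetical data (illustrative only).
rows = [("GENE_A", 20.0), ("GENE_B", 1.0), ("GENE_C", 0.125), ("GENE_D", 4.0)]
others = [{"GENE_D", "GENE_X"}, {"GENE_B"}]
markers = candidate_markers(rows, others)
```

The surviving candidates would still need to be validated against samples known to have, and to lack, the relevant condition, as the text describes.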

6.10. ELECTRONIC NORTHERN BLOT

One limitation of electronic subtraction is that it is
difficult to compare more than a pair of images at once.
Once particular individual gene products are identified as
relevant to further study (via electronic subtraction or
other methods), it is useful to study the expression of
single genes in a multitude of different tissues. In the
lab, the technique of "Northern" blot hybridization is used
for this purpose. In this technique, a single cDNA, or a
probe corresponding thereto, is labeled and then hybridized
against a blot containing RNA samples prepared from a
multitude of tissues or cell types. Upon autoradiography,
the pattern of expression of that particular gene, one at a
time, can be quantitated in all the included samples.

In contrast, a further embodiment of this invention is
the computerized form of this process, termed here
"electronic northern blot." In this variation, a single
gene is queried for expression against a multitude of
prepared and sequenced libraries present within the
database. In this way, the pattern of expression of any
single candidate gene can be examined instantaneously and
effortlessly. More candidate genes can thus be scanned,
leading to more frequent and fruitfully relevant
discoveries. The computer program included as Table 5
includes a program for performing this function, and Table
6 is a partial listing of entries of the database used in
the electronic northern blot analysis.
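The query at the heart of the electronic northern blot, one gene looked up across many libraries, can be sketched as below. This Python fragment is not the Table 5 program; the function name `electronic_northern`, the layout of the `libraries` mapping and the library names are assumptions chosen for illustration.

```python
def electronic_northern(entry, libraries):
    """Report the relative abundance of one gene in every library.

    libraries: dict mapping library name -> (counts, total_clones), where
    counts maps database entry -> abundance number for that library.
    """
    pattern = {}
    for name, (counts, total_clones) in libraries.items():
        # A gene absent from a library is reported with abundance 0.
        pattern[name] = counts.get(entry, 0) / total_clones
    return pattern


# Hypothetical libraries (illustrative only).
libraries = {
    "library_1": ({"GENE_A": 10, "GENE_B": 5}, 1000),
    "library_2": ({"GENE_B": 8}, 2000),
    "library_3": ({"GENE_A": 3}, 500),
}
pattern = electronic_northern("GENE_A", libraries)
```

Normalizing by each library's total clone count keeps libraries of different sizes comparable, in the same way that lanes on a physical blot are compared.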



6.11. PHASE I CLINICAL TRIALS

Based on the establishment of safety and effectiveness
in the above animal tests, Phase I clinical tests are
undertaken. Normal patients are subjected to the usual
preliminary clinical laboratory tests. In addition,
appropriate specimens are taken and subjected to gene
transcript analysis. Additional patient specimens are
taken at predetermined intervals during the test. The
specimens are subjected to gene transcript analysis as
described above. In addition, the gene transcript changes
noted in the earlier rat toxicity study are carefully
evaluated as clinical markers in the followed patients.
Changes in the gene transcript analyses are evaluated as
indicators of toxicity by correlation with clinical signs
and symptoms and other laboratory results. In addition,
subtraction is performed on individual patient specimens
and on averaged patient specimens. The subtraction
analysis highlights any toxicological changes in the
treated patients. This is a highly refined determinant of
toxicity. The subtraction method also annotates clinical
markers. Further subgroups can be analyzed by subtraction
analysis, including, for example, 1) segregation by
occurrence and type of adverse effect; and 2) segregation
by dosage.



6.12. GENE TRANSCRIPT IMAGING ANALYSIS IN CLINICAL STUDIES

A gene transcript imaging analysis (or multiple gene
transcript imaging analyses) is a useful tool in other
clinical studies. For example, the differences in gene
transcript imaging analyses before and after treatment can
be assessed for patients on placebo and drug treatment.
This method also effectively screens for clinical markers
to follow in clinical use of the drug.



6.13. COMPARATIVE GENE TRANSCRIPT ANALYSIS BETWEEN SPECIES

The subtraction method can be used to screen cDNA
libraries from diverse sources. For example, the same cell
types from different species can be compared by gene
transcript analysis to screen for specific differences,
such as in detoxification enzyme systems. Such testing
aids in the selection and validation of an animal model for
the commercial purpose of drug screening or toxicological
testing of drugs intended for human or animal use. When
the comparison between animals of different species is
shown in columns for each species, we refer to this as an
interspecies comparison, or zoo blot.

Embodiments of this invention may employ databases
such as those written using the FoxBASE programming
language commercially available from Microsoft Corporation.
Other embodiments of the invention employ other databases,
such as a random peptide database, a polymer database, a
synthetic oligomer database, or an oligonucleotide database
of the type described in U.S. Patent 5,270,170, issued
December 14, 1993 to Cull, et al., PCT International
Application Publication No. WO 9322684, published November
11, 1993, PCT International Application Publication No. WO
9306121, published April 1, 1993, or PCT International
Application Publication No. WO 9119818, published December
26, 1991. These four references (whose text is
incorporated herein by reference) include teaching which
may be applied in implementing such other embodiments of
the present invention.

All references referred to in the preceding text are
hereby expressly incorporated by reference herein.

Various modifications and variations of the described
method and system of the invention will be apparent to
those skilled in the art without departing from the scope
and spirit of the invention. Although the invention has
been described in connection with specific preferred
embodiments, it should be understood that the invention as
claimed should not be unduly limited to such specific
embodiments.

TABLE 2

Clone numbers 15000 through 20000
Libraries: HUVEC
Arranged by ABUNDANCE
Total clones analyzed: 5000

319 genes, for a total of 1713 clones

     number   N   entry       descriptor

  1  15365   67   HSRPL41     Riboptn L41
  2  15004   65               INCYTE 015004
  3  15638   63   NCY015638   INCYTE 015638
  4  15390   50   NCY015390   INCYTE 015390
  5  15193   47               Fibronectin
  6  15220   47               Riboptn L9
  7  15280   47   NCY015280   INCYTE 015280
  8  15583   33               EST HHCH09 (ICR)
  9  15662   31               Actin, gamma
 10  15036   30               INCYTE 015026
 11               HSEF1AR     Elf 1-alpha
 12               NCY015027   INCYTE 015027
 13               NCY015033   INCYTE 015033
 14  15198                    INCYTE 015198
 15                           Collagenase
 16  15331   19               INCYTE 015221
 17  15263   19   NCY015263   INCYTE 015263
 18  15290   19   NCY015290   INCYTE 015290
 19  15350   18   NCY015350   INCYTE 015350
 20  15030   17   NCY015030   INCYTE 015030
 21  15234   17   NCY015234   INCYTE 015234
 22  15459   16   NCY015459   INCYTE 015459
 23  15353   16   NCY015353   INCYTE 015353
 24  15378   15               Ptn kinase inhib
 25  15255   14               Thymosin beta-4
 26  15401   14               Lipocortin IV
 27  15425   14               Poly-A bp
 28  18212   14   HUMTHYHA    Thymosin, alpha
 29  18216   14   HSMRP1      Motility relat ptn; MRP-1; CD-9
 30  15189   13   HS18D       Interferon induc ptn 1-8D
 31  15031   12   HUMFKBP     FK506 bp
 32  15306   12   HSH2AZ      Histone H2A
 33  15621   12   HUMLEC      Lectin, B-galbp, 14kDa
 34  15789   11   NCY015789   INCYTE 015789
 35  16578   11   HSRPS11     Riboptn S11
 36  16632   11   M61984      EST HHCA13 (ICR)
 37  18314   11   NCY018314   INCYTE 018314
 38  15367   10   NCY015367   INCYTE 015367
 39  15415   10   HSIFNIN1    interferon induc mRNA
 40  15633   10   HSLDHAR     Lactate dehydrogenase
 41  15813   10   CHKNMHCB    C Myosin heavy chain B
 42  18210   10   NCY018210   INCYTE 018210
 43  18233   10   HSRPII140   RNA polymerase II
 44  18996   10   NCY018996   INCYTE 018996
 45  15088    9   HUMFERL     Ferritin, light chain
 46  15714    9   NCY015714   INCYTE 015714
 47  15720    9   NCY015720   INCYTE 015720
 48  15863    9   NCY015863   INCYTE 015863
 49  16121    9   HSET        Endothelin
 50  18252    9   NCY018252   INCYTE 018252
 51  15351    8   HUMALBP     Lipid bp, adipocyte
 52  15370    8   NCY015370   INCYTE 015370
 53           8
TABLE 2 Con't

     number   N   entry

 57               HSRPL17
 60  15245    7   NCY015245
 61  15288    7   NCY015288
 62  15294    7   HSGAPDR
 63  15442    7   HUMLAMB
 64  15485        HSNGMKNA
 65  16646    7   NCY016646
 66  18003    7   HUMPAIA
 67  15032    6   HUMUB
 68  15267    6   HSRPS8
 69  15295    5   NCY015295
 70  15458    6   RNRPS10R
 71  15832    6   RSGALEM
 72  15928        HUMAPOJ
 73  16598    6   HUMTBBM40
 74  18218    6   NCY018218
 75  18499    6   HSP27
 76           6   NCY018963
 77  18997    6   NCY018997
 78           5   HSAGALAR
 79  15475    5   NCY015475
 80           5   NCY015721
 81           5   NCY015865
 82               NCY016270
 83           5   NCY016886
 84           5   NCY018500
 85               NCY018503
 88           4   HUMIFNWRS
 89               NCY015242
 90  15249        NCY015249
 91  15377        NCY015377
 92  15407        NCY015407
 93  15473    4   NCY015473
 94  15588    4   HSRPS12
 95  15684    4   HSEFIG
 96  15782    4   NCY015782
 97  15916    4   HSRPS18
 98  15930    4   NCY015930
 99  16108    4   NCY016108
100  16133    4   NCY016133
descriptor

NADH-ubiq oxidoreductase
INCYTE 015795
INCYTE 016245
INCYTE 018262
Riboptn L17
Riboptn L1
Actin, beta
INCYTE 015245
INCYTE 015288
G-3-PD
Laminin receptor, 54kDa
Uracil DNA glycosylase
INCYTE 016646
Plsmnogen activ gene
Ubiquitin
Riboptn S8
INCYTE 015295
Riboptn S10
UDP-galactose epimerase
Apolipoptn J
Tubulin, beta
INCYTE 018218
Hydrophobic ptn p27
INCYTE 018963
INCYTE 018997
Galactosidase A, alpha
INCYTE 015475
INCYTE 015721
INCYTE 015865
INCYTE 016270
INCYTE 016886
INCYTE 018500
INCYTE 018503
Riboptn L34
Riboptn L1a
tRNA synthetase, trp
INCYTE 015242
INCYTE 015249
INCYTE 015377
INCYTE 015407
INCYTE 015473
Riboptn S12
Elf 1-gamma
INCYTE 015782
Riboptn S18
INCYTE 015930
INCYTE 016108
INCYTE 016133
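
The abundance ordering of Table 2 amounts to counting how many clones of a library share the same database entry and listing the entries by descending count. The following is a minimal sketch in Python, not the FoxBASE routine of the listings; the function name `abundance_table`, the (clone number, entry) record layout, and the sample records are all assumptions made for illustration.

```python
from collections import Counter


def abundance_table(clones):
    """Return (entry, count) pairs sorted by descending count.

    `clones` is an iterable of (clone_number, entry) pairs. This record
    layout is an assumption for the sketch, not the patent's file format.
    """
    counts = Counter(entry for _number, entry in clones)
    # Highest count first; ties are broken by entry name for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample records, not data from the patent's libraries.
sample = [
    (1, "ENTRY_A"), (2, "ENTRY_B"), (3, "ENTRY_A"),
    (4, "ENTRY_C"), (5, "ENTRY_A"), (6, "ENTRY_B"),
]

for entry, count in abundance_table(sample):
    print(entry, count)
```

Counting with a hash table avoids the sort-and-scan pass that a record-oriented database needs, but the resulting ranking is the same.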
TABLE 4

Libraries: THP-1
Subtracting: HMC
Sorted by ABUNDANCE
Total clones analyzed: 7375
1057 genes, for a total of 2151 clones

number   entry   s   descriptor   bgfreq   rfend   ratio



10022
10036
10089
10060
10003
10689
11050
10937
10176
10886
10186
10967
11353
10298
10215
10276
10488
11138
10037
10840
10672
12837
10001
10005
10294
10297
10403
10699
10966
12092
12549
10691
12106
10194
10479
10031
10203
10288
10372
10471
10484
10859
10890
11511
11868
12820
10133
10516
11063
11140
10788
10033
10035
10084
10236
10383



HUMILr
HSMDNCF
HSLAG1CDN
HUMTCSM
HUMMIP1A
HSOP
NCY011050
HSTNPR
HSSOD
HSCDW40
HUMAPR
HUMGDN
NCY011353
NCY010298
HUM4COLA
NCY010276
NCY010488
NCY011138
HUMCAPPRO
HUMADCY
HSCD44E
HUMCYCLOX
NCY010001
NCY010005
NCY010294
NCY010297
NCY010403
NCY010699
NCY010966
NCY012092
HSRHOB
HUMARFIBA
HSADSS
HSCATHL
CLMCYCA
NCY010031
NCY010203
NCY010288
NCY010372
NCY010471
NCY010484
NCY010859
NCY010890
NCY011511
NCY011868
NCY012820
HSIIRAP
HUMP2A
HUMB94
HSHB15RNA
NCY001713
NCY010033
NCY010035
NCY010084
NCY010236
NCY010383
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^0 


262.00 


IL-8 


0 


119 
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Lymphocyte activ gene 
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71 
Collagenase, type IV




£ 


1 o nnn 

• UUU 


INCYTE 010276 
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1 o nnn 

X^ • UUU 


INCYTE 010488 
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1 0 nnn 

X^ • UUU 


INCYTE 011138 
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1 o nnn 
X£ • UUU 


Adenylate cyclase 


1 
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1 n nnn 

XU • UUU 


Adenylate cyclase              0    5    10.000
Cell adhesion glptn            0    5    10.000
Cyclooxygenase-2               0    5    10.000
INCYTE 010001                  0    5    10.000
INCYTE 010005                  0    5    10.000
INCYTE 010294                  0    5    10.000
INCYTE 010297                  0    5    10.000
INCYTE 010403                  0    5    10.000
INCYTE 010699                  0    5    10.000


INCYTE 010966                  0    5    10.000
INCYTE 012092                  0    5    10.000
Oncogene rho                   0    5    10.000
ADP-ribosylation fctr          0    4    8.000
Adenylosuccinate synthetase    0    4    8.000
Cathepsin L                    0    4    8.000
Cyclin A                       0    4    8.000
INCYTE 010031                  0    4    8.000
INCYTE 010203                  0    4    8.000
INCYTE 010288                  0    4    8.000
INCYTE 010372                  0    4    8.000
INCYTE 010471                  0    4    8.000
INCYTE 010484                  0    4    8.000
INCYTE 010859                  0    4    8.000
INCYTE 010890                  0    4    8.000
INCYTE 011511                  0    4    8.000
INCYTE 011868                  0    4    8.000
INCYTE 012820                  0    4    8.000
IL-1 antagonist                0    4    8.000
Phosphatase, regul 2A          0    4    8.000
TNF-induc response             0    4    8.000
HB15 gene; new Ig              0    3    6.000
INCYTE 001713                  0    3    6.000
INCYTE 010033                  0    3    6.000
INCYTE 010035                  0    3    6.000
INCYTE 010084                  0    3    6.000
INCYTE 010236                  0    3    6.000
INCYTE 010383                  0    3    6.000
TABLE 4 Con't

number   entry       s   descriptor      bgfreq   rfend   ratio

10450    NCY010450       INCYTE 010450   0        3       6.000
10470    NCY010470       INCYTE 010470   0        3       6.000
10504    NCY010504       INCYTE 010504   0        3       6.000
10507    NCY010507       INCYTE 010507   0        3       6.000
10598    NCY010598       INCYTE 010598   0        3       6.000
10779    NCY010779       INCYTE 010779   0        3       6.000
10909    NCY010909       INCYTE 010909   0        3       6.000
10976    NCY010976       INCYTE 010976   0        3       6.000
10985    NCY010985       INCYTE 010985   0        3       6.000
11052    NCY011052       INCYTE 011052   0        3       6.000
11068    NCY011068       INCYTE 011068   0        3       6.000
11134    NCY011134       INCYTE 011134   0        3       6.000
11136    NCY011136       INCYTE 011136   0        3       6.000
11191    NCY011191       INCYTE 011191   0        3       6.000
11219    NCY011219       INCYTE 011219   0        3       6.000
11386    NCY011386       INCYTE 011386   0        3       6.000
11403    NCY011403       INCYTE 011403   0        3       6.000
11460    NCY011460       INCYTE 011460   0        3       6.000
11618    NCY011618       INCYTE 011618   0        3       6.000
11686    NCY011686       INCYTE 011686   0        3       6.000
12021    NCY012021       INCYTE 012021   0        3       6.000
12025    NCY012025       INCYTE 012025   0        3       6.000
12320    NCY012320       INCYTE 012320   0        3       6.000
12330    NCY012330       INCYTE 012330   0        3       6.000
12853    NCY012853       INCYTE 012853   0        3       6.000
14386    NCY014386       INCYTE 014386   0        3       6.000
14391    NCY014391       INCYTE 014391   0        3       6.000
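
The ratio column of Table 4 is consistent with dividing a transcript's count in the target library (rfend) by its count in the subtracted library (bgfreq), with one half standing in for the divisor when the transcript is absent from the subtracted library, as the `STORE 1/2` step in the Table 5 listing suggests. The following is a minimal sketch of that arithmetic in Python; the function name `subtraction_ratio` is an assumption for illustration, and the sketch is not the FoxBASE program itself.

```python
def subtraction_ratio(rfend, bgfreq):
    """Ratio of the target-library count to the background count.

    A background count of zero is replaced by 1/2 so that the ratio stays
    finite. This mirrors the `STORE 1/2` step of the Table 5 listing as it
    is read here; it is an interpretation, not a confirmed specification.
    """
    actual = bgfreq if bgfreq > 0 else 0.5
    return rfend / actual


# Three rows copied from Table 4: (descriptor, bgfreq, rfend).
rows = [
    ("Oncogene rho", 0, 5),
    ("Cathepsin L", 0, 4),
    ("INCYTE 010383", 0, 3),
]

for descriptor, bgfreq, rfend in rows:
    ratio = subtraction_ratio(rfend, bgfreq)
    print(f"{descriptor:<16}{bgfreq:>3}{rfend:>3}{ratio:>8.3f}")
```

With a background count of zero the ratio is simply twice the target count, which matches the 10.000, 8.000 and 6.000 entries of the table for counts of 5, 4 and 3.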
TABLE 5

* Master menu for SUBTRACTION output
SET EXACT ON

SET DEVICE TO SCREEN

USE "SmartGuy:FoxBASE+/Mac:fox files:Clones.dbf"
GO TOP
STORE NUMBER TO INITIATE
GO BOTTOM
STORE NUMBER TO TERMINATE

STORE '        ' TO Target1

STORE '        ' TO Object1

STORE '        ' TO Object2

STORE '        ' TO Object3

STORE 0 TO ANAL

STORE 0 TO EMATCH

STORE 0 TO HMATCH

STORE 0 TO OMATCH

STORE 0 TO IMATCH

STORE 0 TO PTF

STORE 1 TO BAIL
DO WHILE .T.
* Program.: Subtraction 2.fmt

* Notes...: Format file Subtraction 2

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",9 COLOR 0,0,0,
@ PIXELS 75,120 TO 178,241 STYLE 3871 COLOR 0,0,-1,-25600,-1,-1

@ PIXELS 27,134 SAY "Subtraction Menu" STYLE 65536 FONT "Geneva",274 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 117,126 GET EMATCH STYLE 65536 FONT "Chicago",12 PICTURE "@*C Exact" SIZE 15,62 CO
@ PIXELS 135,126 GET HMATCH STYLE 65536 FONT "Chicago",12 PICTURE "@*C Homologous" SIZE 15,1
@ PIXELS 153,126 GET OMATCH STYLE 65536 FONT "Chicago",12 PICTURE "@*C Other spc" SIZE 15,84
@ PIXELS 90,132 SAY "Matches:" STYLE 65536 FONT "Geneva",12 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 171,126 GET Imatch STYLE 65536 FONT "Chicago",12 PICTURE "@*C Incyte" SIZE 15,65 CO
@ PIXELS 252,137 GET initiate STYLE 0 FONT "Geneva",12 SIZE 15,70 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 252,236 GET terminate STYLE 0 FONT "Geneva",12 SIZE 15,70 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 252,35 SAY "Include clones" STYLE 65536 FONT "Geneva",12 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 252,215 SAY "->" STYLE 65536 FONT "Geneva",14 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 198,126 GET PTF STYLE 65536 FONT "Chicago",12 PICTURE "@*C Print to file" SIZE 15,9
@ PIXELS 90,9 TO 181,109 STYLE 3871 COLOR 0,0,-1,-25600,-1,-1

@ PIXELS 90,288 TO 181,397 STYLE 3871 COLOR 0,0,-1,-25600,-1,-1

@ PIXELS 81,296 SAY "Background:" STYLE 65536 FONT "Geneva",270 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 45,135 GET ANAL STYLE 65536 FONT "Chicago",12 PICTURE "@*R Overall;Function" SIZE 4
@ PIXELS 81,26 SAY "Targets" STYLE 65536 FONT "Geneva",270 COLOR 0,0,-1,-1
@ PIXELS 108,20 GET target1 STYLE 0 FONT "Geneva",9 SIZE 12,79 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 135,20 GET target2 STYLE 0 FONT "Geneva",9 SIZE 12,79 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 162,20 GET target3 STYLE 0 FONT "Geneva",9 SIZE 12,79 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 108,299 GET Object1 STYLE 0 FONT "Geneva",9 SIZE 12,79 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 135,299 GET Object2 STYLE 0 FONT "Geneva",9 SIZE 12,79 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 162,299 GET Object3 STYLE 0 FONT "Geneva",9 SIZE 12,79 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 276,324 GET Bail STYLE 65536 FONT "Chicago",12 PICTURE "@*R Run;Bail out" SIZE 4112

* EOF: Subtraction 2.fmt



IF Bail=2
CLEAR

CLOSE DATABASES

USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"
SET SAFETY ON
SCREEN 1 OFF
RETURN
ENDIF

STORE UPPER(Target1) TO Target1

STORE UPPER(Target2) TO Target2

STORE UPPER(Target3) TO Target3

STORE UPPER(Object1) TO Object1

STORE UPPER(Object2) TO Object2

STORE UPPER(Object3) TO Object3



GO DUTIAIB . 

CWYJJBST GAP.^IELDS }iJa»3ER,lib3C63^^^ TZKFKQM 
USE TEMPNUM
COUNT TO TOT

COW TO TMRED PGR fc''ESOR;rfe^'^^^^ ' - • * . • ;^ 

USE TEMPRED

COPY STRUCTURE TO TEMPDESIG

X T Sn atdhpl ' ' • " ' v^-^.r. ^ Lx;,-' . : e;v ^.;^ 

AFPBa3 FROM FOR O^^B* 

APggN O PROW.rogMCW FOR Pg'HV 
I P tto fttChsl ' " . 

Z g Iff fltehal 

APPEND FROM TEMBMM.FOR Da^I'.OR.Di'X* ' ' ' ' '^ - -v-' • - i^O;/,}; 

1"'.-*^ "tiV ■ ■ 



COUNT TO STARTOT

COPY STRUCTURE TO TEMPLIB
USE TEMPLIB

APPEND FROM TEMPDESIG FOR library=UPPER(target1)
IF target2<>' '
APPEND FROM TEMPDESIG FOR library=UPPER(target2)
ENDIF

IF target3<>' '

APPEND FROM TEMPDESIG FOR library=UPPER(target3)

ENDIF

COUNT TO ANALTOT

USE TEMPDESIG

COPY STRUCTURE TO TEMPSUB
USE TEMPSUB

APPEND FROM TEMPDESIG FOR library=UPPER(Object1)
IF target2<>' '

APPEND FROM TEMPDESIG FOR library=UPPER(Object2)
ENDIF

IF target3<>' '

APPEND FROM TEMPDESIG FOR library=UPPER(Object3)
ENDIF

COUNT TO SUBTRACTOT
SET TALK OFF

* COMPRESSION SUBROUTINE A
? 'COMPRESSING QUERY LIBRARY'
USE TEMPLIB
USE LIBSORT

REPLACE ALL RFEND WITH 1
MARK1 = 1

DO WHILE SW2=0 ROLL

COUNT TO AUNIQUE
LOOP

STORE ENTRY TO TESTB

STORE D TO DESIGB

IF TESTA = TESTB .AND. DESIGA=DESIGB

LOOP
GO MARK1

MARK1 = MARK1+DUP

SW=1

LOOP

ENDDO ROLL

SORT ON RFEND/D,NUMBER TO TEMPTARSORT

^REPLACE A££i START WZ3B RPSXtD/ZOSEKS'lOOOO 
COCffTF TO • VEMPOSARCO 

* COMPRESSION SUBROUTINE B

? 'COMPRESSING TARGET LIBRARY'
USE TEMPSUB
SORT ON ENTRY,NUMBER TO SUBSORT
USE SUBSORT

COUNT TO SUBGENE

REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2=0

DO WHILE SW2=0 ROLL
IF MARK1 >= SUBGENE
PACK

COUNT TO BUNIQUE

SW2=1

LOOP

ENDIF
GO MARK1
DUP = 1

STORE ENTRY TO TESTA
STORE D TO DESIGA

SW = 0

DO WHILE SW=0 TEST

STORE ENTRY TO TESTB
STORE D TO DESIGB

IF TESTA = TESTB .AND. DESIGA=DESIGB



.1 H 



t m 

DUP = 1

STORE ENTRY TO TESTA

STORE D TO DESIGA
DUP = DUP+1
LOOP

GO MARK1

REPLACE RFEND WITH DUP

LOOP

LOOP
ENDDO ROLL

SORT ON RFEND/D,NUMBER TO TEMPSUBSORT

*REPIAC8 ALL 8gftRT Wira itPEro/IIgMg*10 
ODQETT^ID 1BG50BC0 » v- e, 



? 'SUBTRACTING LIBRARIES'
USE TEMPSUBSORT

APPEND FROM TEMPTARSORT
COUNT TO BAILOUT

MARK = 0

SELECT 1

MARK = MARK+1
IF MARK>BAILOUT

STORE ENTRY TO SCANNER
SELECT 2

LOCATE FOR ENTRY=SCANNER
IF FOUND()
STORE RFEND TO BIT1
STORE RFEND TO BIT2
ELSE
STORE 1/2 TO BIT1

STORE 0 TO BIT2
ENDIF

SELECT 1

REPLACE BGFREQ WITH BIT2
REPLACE ACTUAL WITH BIT1

ENDDO

REPLACE ALL RATIO WITH RFEND/ACTUAL

? 'DOING FINAL SORT BY RATIO'
SORT ON RATIO/D,BGFREQ/D,DESCRIPTOR TO FINAL
USE FINAL



set talk off

DO CASE

CASE PTF=0

SET DEVICE TO PRINT

SET PRINT ON

CASE PTF=1

esr ALTER2ZATE TO "Adenoid .Patent Pigv&rass Subtraction. 
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■ S^gOgS FZNTZKE BISJnTHE.TO C3QUPSSC 
STORE COMPSEC/60 TO COMPMIN

SET MARGIN TO 10

@1,1 SAY "Library Subtraction Analysis" STYLE 65536 FONT "Geneva",274 COLOR 0,0,0,-1,
?

? date()

?? '   '

?? TIME()

? 'Clone numbers '

?? STR(INITIATE,5,0)
?? ' through '
?? STR(TERMINATE,6,0)
? 'Libraries: '
? Target1
IF Target2<>' '
?? ', '

?? Target2
ENDIF
IF Target3<>' '

?? ', '

?? Target3

ENDIF

? 'Subtracting: '
? Object1
IF Object2<>' '
?? ', '
?? Object2

ENDIF

IF Object3<>' '
?? ', '
?? Object3
ENDIF

? 'Designations: '

IF Ematch=0 .AND. Hmatch=0 .AND. Omatch=0 .AND. IMATCH=0
?? 'All'
ENDIF

IF Ematch=1
?? 'Exact, '
ENDIF

IF Hmatch=1
?? 'Human, '
ENDIF

IF Omatch=1
?? 'Other sp.'
ENDIF

IF Imatch=1
?? 'INCYTE'
ENDIF

IF ANAL=1

? 'Sorted by ABUNDANCE'

ENDIF

IF ANAL=2

? 'Arranged by FUNCTION'
ENDIF
? 'Total clones represented: '
?? STR(TOT,5,0)
? 'Total clones analyzed: '
?? STR(STARTOT,5,0)

? 'Total computation time: '

?? ' minutes'
? '

* location r = function s = species i = inte

*******

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",9 COLOR 0,0,0
DO CASE

CASE ANAL=1

?? STR(AUNIQUE,4,0)

?? ' genes, for a total of '

?? ' clones'

?

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
list OFF fields number,D,F,Z,R,ENTRY,S,DESCRIPTOR,BGFREQ,RFEND,RATIO,I

SET PRINT OFF
CLOSE DATABASES

USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"

CASE ANAL=2
* arrange/function

SET PRINT ON

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",268 COLOR 0

? '     BINDING PROTEINS'

?

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",265 COLOR 0

? 'Surface molecules and receptors:'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
list OFF fields number,D,F,Z,R,ENTRY,S,DESCRIPTOR,BGFREQ,RFEND,RATIO,I FOR R='B'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",265 COLOR 0
? 'Calcium-binding proteins:'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,

list OFF fields number,D,F,Z,R,ENTRY,S,DESCRIPTOR,BGFREQ,RFEND,RATIO,I FOR R='C'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",265 COLOR 0

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
list OFF fields number,D,F,Z,R,ENTRY,S,DESCRIPTOR,BGFREQ,RFEND,RATIO,I FOR R='S'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",265 COLOR 0
? 'Other binding proteins:'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
list OFF fields number,D,F,Z,R,ENTRY,S,DESCRIPTOR,BGFREQ,RFEND,RATIO,I FOR R='I'

?

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",268 COLOR 0

? '     ONCOGENES'

?

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",265 COLOR 0
? 'General oncogenes:'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
list OFF fields number,D,F,Z,R,ENTRY,S,DESCRIPTOR,BGFREQ,RFEND,RATIO,I FOR R='O'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",265 COLOR 0
? 'GTP-binding proteins:'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
list OFF fields nuariDer,D,F,Z,R,E?TRV,S,DESCRIPTOR,BSPRBQ,RPEND,RAT10,I FOR Ra*0' 
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f^«i S^t^°" '^'^ ^' -"."'^ ae«,«2 PIXELS ram -toiv^tica-.aes color o 

SCREEMl WPB 0 HSADZII3 "Screen l- AT 40,2 SIZE 386,493 PIXELS raw '£S£SS^ 7 r«fi^fi» «' 

Ust OPT fields Imba,D,7,^,T^,wm,s,BBsanntl^«^^ 

• • • • 

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",265 COLOR 0

? 'Kinases and Phosphatases:'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
...list OPF.fislds ntImber,D,P,2,R,E^^tW,SpCESCRlPTOR,B0PR^ 

' . SCREEN I'TVPE 0 HEADBC -Scarten AT 40,3 SIZE 286,492 PIXELS POW •Helvetica- 2tf«; rr*r/» n 

SCRE^ Wra 0 HE&DIKG ."Screeii 1« M 40,2 SIZE 286,492 PIXSW PCNT •Geneva* 7 ccjlob o a n 
-list OFF '£i«lds nuinber,p,F,2,R,amtf,S,nESCiaPI0R,BC^ 

7. - " - ^ ' ' 

SCRSffll 1 TOPB 0 HEMJDa «9c areBn !■ AT 40.2 SIZE 28^,492 PIXELS KIOT 'MelVBtica' 2fiB cnr/w n 



SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",265 COLOR 0
? 'Transcription and Nucleic Acid-binding proteins:'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
Ust OFF fields tmtoer.D,7,Z.R.wm.S.liBSCFamR.IlC^-j^ 

SCSm^JY^O EBADnC -screen 1- AT 40,2 SiZE 286.492 PIXELS " FOOT •HeaVetica-,265 COUsi.O 

SOami WPE 6 HE«p& 'Screen !• AT 40,2 SIZE 286,493 PIXELS PCWP •Geneva', 7 coum Hon' 
.list OFF fields «inib«,0,P>Z,R>SRIsy.;6lOESCRiPra^^ *''0'«< 

f^cS^pSo^^ ''""^ ^"'^^ .«»^ -Helvetica.. 265 ODUJR 0 

ff!?iJJ; iP?^''^^^?^^ ^screen l" AT 40,2 SIZS 286,492 PIXELS FQJW 'Geneva'! 7 COLOR n 0 0 ' 
list. OPT fields awtoar,D;P,Z,R,E»nRr,S,DESCRIPI0R.8QPRSQ.8raOT,jS.I ' • ' 

^'^tLTS^e^j^^ ^' "^■285.492 PIXELS FOOT -Kelvettca'. ,265 COLOR 0 

SCffiENl TpS 0 JJEapING 'Screen 1' AT 40,2 SIZE 286.492 PIXELS FOOT ■Geneva'' 7 mrm 0 o n 
list OPP fields liai4»r,D,.P,Z,R,E!TO«,S,ISSa«PT0R,B3PlBQ,^ffi 

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",268 COLOR 0

? '     ENZYMES'

?



ffepS^lL?^ '^"^ ^' -Helvetica'.zes COLOR 0 

eCIffiENJ. WPE 0 "HEJtonJG 'Screen I' AT 40,2 SIZE 286,493 PIXELS PONT "Gerieva' 7 corrm o o o 
list PP? fi^dS _nu«ber.p.P.Z,R,a»niy,S,DESCiaWCR^ 

-f^^^^^St:^' .Helv«tica.,265. fftt<« 0 

BCraEll OTPE 0 HEADING -ScteBa 1' AT 40,2 SIZE 286,492 PIXELS PCHT 'Geneva'. 7 camn 0 0 0 
list 0?P fielda nuinber/0.P,Z,R,EOTR!f,S,D3SCRlW0R,BGFREQ,HFEND 

f^At^j^^^ulf^r. -Helvetica..265 COWR 0 

SOffiEWl WPE 0 HSftDDlG 'Screen 1' AT 40,2 SIZE 286.492 PIXELS FOOT •Geneva' 7 OMR 0 0 ft 

list OPP fi«lds.nuiSber,D.P.Z,R.EIITRy,S,nBSCRIProR.aGPRBQ.^^.IWKO,l PO^ ' ' ' 

fSaJ SlLSlW*^* ^' "™ -Helvetica'.afS COLOR 0 

SCREEN 1 TYPE 0 HEftDIKJ "Screen 1' AT 40,2 SIZE 236.492 PIXELS PONT "Oeneva' 7 wirnp Ann 

Hat OPP fields niaiber.D.P,z,R,BfrRY.S.DESCRlPiOR.BOTREQ.^.iS "'5'°' 

fS»5 ^ Je^SlSa;?*""" ™^ •Rel>;«tic«'.26S OLOR 0 

SCHEEM 1 TWE 0 HEADIHG •Screen !• AT 40,2 SlZS 286.492 PIXELS PGNT -Geneva',? COLOR ff.COi 
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list OFF fields nundberiD«F,z,R,B7m,s»rESCRXProH«BQFI^,R^^ FOR Rn*M* 

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",265 COLOR 0
? 'Nucleic acid metabolism:'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
liBt. OFF fields wmbex,r>,F,Z,K,T!tm^,B,ISBSCSaFl!OfR,BaF^ rOR rb*(7' 

* . * . *. ♦ • 

Wwral wra 0 HEWmse -Screen !■ AT 40i3 SIZE 386,492 PDOSLS* Fwr •Helvetica -,3 65 COLOR 0 

u-BCRSar i imi 0 HEM)J»3 •screen !• AT 40,2 SIZE 286,492 PIXELS PCOT •Geneva*,? COLOR 0.0.0. 
;.,rlist-OFF fields Bund)er,D,F,Z,R,ENncr,9,DSSCiaPX0R,fiCFBEQ,R?END,RA!^ iCR R»*W* 

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",265 COLOR 0
? 'Other enzymes:'

f?™^ ^ SP?.^ HSMINCS ■Screaa I- AT 4Q,2 SIZE 286,432 PIXELS TOTT -Geneva', 7* COLOR 0.0.0 
,>Ut OFF. fields aptiber,D,P,z.^^ 



SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",268 COLOR 0

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",265 COLOR 0
? 'Stress response:'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,

list OFF fields number,D,F,Z,R,ENTRY,S,DESCRIPTOR,BGFREQ,RFEND,RATIO,I FOR R='H'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",265 COLOR 0

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,

list OFF fields number,D,F,Z,R,ENTRY,S,DESCRIPTOR,BGFREQ,RFEND,RATIO,I FOR R='K'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",265 COLOR 0
? 'Other clones:'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
list OFF fields number,D,F,Z,R,ENTRY,S,DESCRIPTOR,BGFREQ,RFEND,RATIO,I FOR R='X'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Helvetica",265 COLOR 0
? 'Clones of unknown function:'

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
list OFF fields number,D,F,Z,R,ENTRY,S,DESCRIPTOR,BGFREQ,RFEND,RATIO,I FOR R='U'

ENDCASE

DO "Test print.prg"

SET PRINT OFF

SET DEVICE TO SCREEN

ERASE TEMPLIB.DBF

ERASE TEMPNUM.DBF

ERASE TEMPDESIG.DBF
SET MARGIN TO 0

ENDDO
* Northern (single), version 11-25-94
close databases
SET TALK OFF
SET PRINT OFF
SET EXACT OFF



• TO Dobject 



STORE .• » TO EbbijBCt 

SIQRE • ^ 
STORE 0 TO NUMB
STORE 0 TO ZOG
STORE 1 TO Bail
DO WHILE .T.

* Program.: Northern (single).fmt

* Version.: FoxBASE+/Mac, revision 1.10
* Notes...: Format file Northern (single)

. • POT^ 15.81 TO 46^7 BrM^2M47 C&l Sof-f "is«of5S ^ •««»«va.,ia COtm "0,0,0 

S "2'*^* ®^ 28447 COLOR 6.6,0 -25600 -1 -1 

^ llSSf lll'??,^"^ «S36 TOW 'OeSSS;. la'cOLOR 0,0.0.-1 -1 -1 

* EOF: Northern (single).fmt
READ

IF Bail=2

CLEAR

screen 1 off

RETURN

ENDIF

USE "SmartGuy:FoxBASE+/Mac:Fox files:Lookup.dbf"
IF Eobject<>' '

STORE UPPER(Eobject) to Eobject
SET SAFETY OFF

SORT ON Entry TO "Lookup entry.dbf"

SET SAFETY ON

USE "Lookup entry.dbf"
LOCATE FOR Look=Eobject
IF .NOT.FOUND()

LOOP

ENDIF

BROWSE

STORE Entry TO Searchval

CLOSE DATABASES

ERASE "Lookup entry.dbf"

ENDIF


IF Dobject<>' '

SET EXACT OFF

SET SAFETY OFF

SORT ON descriptor TO "Lookup descriptor.dbf"
SET SAFETY ON

USE "Lookup descriptor.dbf"
CLEAR
BROWSE

STORE Entry TO Searchval

ERASE "Lookup descriptor.dbf"
SET EXACT ON

ENDIF

*

IF Numb<>0

USE "SmartGuy:FoxBASE+/Mac:Fox files:clones.dbf"
BROWSE

STORE Entry TO Searchval

ENDIF

CLEAR

? 'Northern analysis for entry '
?? Searchval

? 'Enter Y to proceed'

WAIT TO OK
CLEAR

IF UPPER(OK)<>'Y'

screen 1 off

* COMPRESSION SUBROUTINE FOR Library.dbf

? 'Compressing the Libraries file now...'

USE "SmartGuy:FoxBASE+/Mac:Fox files:libraries.dbf"

SET SAFETY OFF

SORT ON library TO "Compressed libraries.dbf" FOR entered>0
SET SAFETY ON

USE "Compressed libraries.dbf"
DELETE FOR entered=0

COUNT TO TOT

DO WHILE SW2=0 ROLL
IF MARK1 >= TOT

PACK

LOOP

STORE library TO TESTA
SKIP

STORE library TO TESTB

IF TESTA = TESTB

ENDIF

MARK1 = MARK1+1
LOOP
ENDDO ROLL

* Northern analysis

CLEAR

? 'Doing the northern now...'
SET TALK ON

USE "SmartGuy:FoxBASE+/Mac:Fox files:clones.dbf"

SET SAFETY OFF

COPY TO "Hits.dbf" FOR entry=searchval
SET SAFETY ON
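
The surviving part of the northern listing stops after every clone matching the selected entry has been copied to a hits file. One plausible next step, sketched here in Python as an illustration only, is to tally those hits by library so that the entry's abundance can be compared across libraries. The function name `electronic_northern`, the (library, entry) record layout, and the sample records are hypothetical, and nothing here is confirmed by the listing.

```python
from collections import Counter


def electronic_northern(clones, search_entry):
    """Count the clones that match `search_entry` in each library.

    `clones` is an iterable of (library, entry) pairs. The layout is an
    assumption for this sketch, not the structure of the clones file.
    """
    hits = Counter(library for library, entry in clones if entry == search_entry)
    return dict(hits)


# Hypothetical records, not data from the patent.
clones = [
    ("LIB_A", "ENTRY_X"),
    ("LIB_A", "ENTRY_X"),
    ("LIB_B", "ENTRY_X"),
    ("LIB_B", "ENTRY_Y"),
]

print(electronic_northern(clones, "ENTRY_X"))
```

Libraries with no matching clone are simply absent from the result, so a caller that needs explicit zero counts would have to supply the full list of library names.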
* MASTER ANALYSIS 3; VERSION 12-5-94

* Master menu for analysis output

SET TALK OFF
SET SAFETY OFF

SET DEVICE TO SCREEN

SET DEFAULT TO "SmartGuy:FoxBASE+/Mac:fox files:Output programs:"
USE "SmartGuy:FoxBASE+/Mac:fox files:Clones.dbf"

STORE NUMBER TO INITIATE
GO BOTTOM
STORE NUMBER TO TERMINATE
STORE 0 TO ENTIRE
STORE 0 TO CONDEN
STORE 0 TO ANAL

STORE 0 TO EMATCH

STORE 0 TO HMATCH
STORE 0 TO OMATCH
STORE 0 TO IMATCH
STORE 0 TO XMATCH
STORE 0 TO PRINTON
STORE 0 TO PTF
DO WHILE .T.

* Program.: Master analysis.fmt

* Date....: 12/9/94

* Version.: FoxBASE+/Mac, revision 1.10

* Notes...: Format file Master analysis

SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",9 COLOR 0,0,0,

@ PIXELS 39,255 TO 277,430 STYLE 28447 COLOR 0,0,-1,-25600,-1,-1
@ PIXELS 75,120 TO 178,241 STYLE 3871 COLOR 0,0,-1,-25600,-1,-1

@ PIXELS 27,98 SAY "Customized Output Menu" STYLE 65536 FONT "Geneva",274 COLOR 0,0,-1,-1,-1
« 255^ ^^'^^ conden STYLE 65536 FONT 'ChicagoM2 PICTURE "0*c Condensed format- SIZE 
t ^SSE^ nii^^^ ^ S^^^ ^5536 FONT •ChicagoM2 PICTURE »@*RV Sort /number; Sort /entry i 
@ PIXELS 117,126 GET EMATCH STYLE 65536 FONT "Chicago",12 PICTURE "@*C Exact" SIZE 15,62 CO
@ PIXELS 135,126 GET HMATCH STYLE 65536 FONT "Chicago",12 PICTURE "@*C Homologous" SIZE 15,1
@ PIXELS 153,125 GET OMATCH STYLE 65536 FONT "Chicago",12 PICTURE "@*C Other spc" SIZE 15,84

@ PIXELS 90,152 SAY "Matches:" STYLE 65536 FONT "Geneva",268 COLOR 0,0,-1,-1,-1,-1

@ PIXELS 63,54 GET PRINTON STYLE 65536 FONT "Chicago",12 PICTURE "@*C Include clone listing"
@ PIXELS 171,126 GET Imatch STYLE 65536 FONT "Chicago",12 PICTURE "@*C Incyte" SIZE 15,65 CO
@ PIXELS 252,146 GET initiate STYLE 0 FONT "Geneva",12 SIZE 15,70 COLOR 0,0,-1,-1,-1,-1

@ PIXELS 270,146 GET terminate STYLE 0 FONT "Geneva",12 SIZE 15,70 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 234,134 SAY "Include clones" STYLE 65536 FONT "Geneva",12 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 270,125 SAY "->" STYLE 65536 FONT "Geneva",14 COLOR 0,0,-1,-1,-1,-1

@ PIXELS 198,126 GET PTF STYLE 65536 FONT "Chicago",12 PICTURE "@*C Print to file" SIZE 15,9

@ PIXELS 189,0 TO 257,120 STYLE 3871 COLOR 0,0,-1,-25600,-1,-1

@ PIXELS 209,8 SAY "Library selection" STYLE 65536 FONT "Geneva",266 COLOR 0,0,-1,-1,-1,-1
@ PIXELS 227,18 GET ENTIRE STYLE 65536 FONT "Chicago",12 PICTURE "@*RV All;Selected" SIZE 16

* EOF: Master analysis.fmt
READ

IF ANAL>9
CLEAR

CLOSE DATABASES
ERASE TEMPMASTER.DBF

USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"

SET SAFETY ON

SCREEN 1 OFF

RETURN

ENDIF

CLEAR

? INITIATE
? TERMINATE
? CONDEN

? ANAL
? Ematch

? Hmatch

? Omatch

SET TALK ON

IF ENTIRE=2

USE "Unique libraries.dbf"
REPLACE ALL i WITH ' '
BROWSE FIELDS i,libname,library,total,entered AT 0,0
ENDIF

USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"
*COPY TO TEMPNUM FOR NUMBER>=INITIATE
*USE TEMPNUM

COPY STRUCTURE TO TEMPLIB
USE TEMPLIB

IF ENTIRE=1

APPEND FROM "SmartGuy:FoxBASE+/Mac:fox files:Clones.dbf"
ENDIF

IF ENTIRE=2
USE "Unique libraries.dbf"

COPY TO SELECTED FOR UPPER(i)='Y'

USE SELECTED

STORE RECCOUNT() TO STOPIT
MARK=1

DO WHILE .T.
IF MARK>STOPIT
CLEAR
EXIT
ENDIF

USE SELECTED
GO MARK

STORE library TO THISONE
? 'COPYING '
?? THISONE

USE TEMPLIB

APPEND FROM "SmartGuy:FoxBASE+/Mac:fox files:Clones.dbf" FOR library=THISONE
STORE MARK+1 TO MARK

LOOP

ENDDO
ENDIF

USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"
COUNT TO STARTOT
COPY STRUCTURE TO TEMPDESIG
USE TEMPDESIG

IF Ematch=0 .AND. Hmatch=0 .AND. Omatch=0 .AND. IMATCH=0
APPEND FROM TEMPLIB
ENDIF

IF Ematch=1

APPEND FROM TEMPLIB FOR D='E'
ENDIF

IF Hmatch=1

APPEND FROM TEMPLIB FOR D='H'
ENDIF

IF Omatch=1

APPEND FROM TEMPLIB FOR D='O'
ENDIF

IF Imatch=1

APPEND FROM TEMPLIB FOR D='I'.OR.D='X'.OR.D='N'
ENDIF
IF Xmatch=1

APPEND FROM TEMPLIB FOR D='X'
ENDIF

COUNT TO ANALTOT
set talk off

DO CASE
SET DEVICE TO PRINT

SET PRINT ON

SET ALTERNATE TO "Total function sort.txt"
*SET ALTERNATE TO "H and O function sort.txt"
*SET ALTERNATE TO "Shear Stress HUVEC 2:Abundance sort.txt"
*SET ALTERNATE TO "Shear Stress HUVEC 2:Abundance con.txt"
*SET ALTERNATE TO "Shear Stress HUVEC 2:Function sort.txt"
*SET ALTERNATE TO "Shear Stress HUVEC 2:Distribution sort.txt"
*SET ALTERNATE TO "Shear Stress HUVEC 1:Clone list.txt"
*SET ALTERNATE TO "Shear Stress HUVEC 2:Location sort.txt"
SET ALTERNATE ON
ENDCASE



IF PRINTON=1

@1,30 SAY "Database Subset Analysis" STYLE 65536 FONT "Geneva",274 COLOR 0,0,0,-1,-1,-1

ENDIF

?

?

? date()
?? TIME()

? 'Clone numbers '
?? STR(INITIATE,6,0)
?? ' through '
?? STR(TERMINATE,6,0)
? 'Libraries: '

IF ENTIRE=1

? 'All libraries'

ENDIF
IF ENTIRE=2
MARK=1
DO WHILE .T.
IF MARK>STOPIT

EXIT

ENDIF

USE SELECTED
GO MARK
? ' '

?? TRIM(libname)
STORE MARK+1 TO MARK
LOOP

ENDDO
ENDIF

? 'Designations: '

IF Ematch=0 .AND. Hmatch=0 .AND. Omatch=0 .AND. IMATCH=0
ENDIF

IF Ematch=1
?? 'Exact, '

ENDIF

IF Hmatch=1
?? 'Human, '
ENDIF
IF Omatch=1
?? 'Other sp.'

ENDIF

IF Imatch=1
?? 'INCYTE'
ENDIF

IF Xmatch=1
?? 'EST'
ENDIF
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7. 'Condensed fomac .analyais* 
iV. NUMBER* . 



.1 M- 



? .'Sorted ty ENiW 



? ''Arranged tay ABUNDAKCS' 



t. V- 



f .V' 



.•-5 



IP 

? /Sorted by IMrERSST' . 
ENIJIP 

IP AtC^S . 

7 !Arranged.)3y LOCATim' 



1 . 1. 



IP ftNALa.e V- . ^ 
? .'Arranged by DISTOZBUTZGN* 
£NDZF . 
IP ANALo? 

? 'Arranged by FUNCTICN.': 
HNDSF-'- "-k:''^' \. . -u 

7 'Total clones represeiitedj ' f 
?? STO(CTAR3OT.6.0) . . 
? 'Total clones analyzed-* * 
?? SXR(ANAi;iOT,6»0> . 
? 

7 'l-, a library d = designation £ « distribution z = location r « function c cer 

USE TE MP0B3IQ 

SCREEN 1 TOTE 0 HEADING "Screen V AT 40,2 SIZE 286,492 PIXELS FOOT "Qeneva"*,? COLOR 0,0,0, 
DO CASE • 
CASE AKALsl 

* sort/nuafber 

SET HBADINa ON • .. . 

IF OQNDEMal 

SQRT TO T®JP1 CN ENTRY,NOMBSR 
DO "CCMPRSSSIQb? aURiber.PRG' 



SORT TO TEKPl W NUKaSR 
OSB TQfPl 

list off fields lusitber«L,D,F,Z,R,C,a7TRy,S,D55CRIPTQR 

♦list off fields nuinber,L,D,P,S,R,C,Ensy,S,DSSCRIPT0R,LBZ3TH,RFEaro,IOT^ 
CLOag DATABASES 
ERASE TQIPI.CSF 
fiI3DIF 

CASE AKALs2 

* florn/OESCRlPTOR 

SBT HEADING ON 

* SORT TO TQiPl ON DESCRIPTOR, ENTOT, NUMBER/ S for Da'S' .OR.I>»'K' •OR.D«'0' .OR-D^'X' .OR.Da'l' 
•SCOT TO TOaPl CN ENTOT, DESCRIPTOR, NOMEER/S for D-'E' .OR.D«'H''.OR.D»'0' .OR.D«'X' .OR.D*'I' 
SQRT TO TEMPI ON ENraY,START/S for Da'E* .OR.D=<H' .OR.D='0' .OR.D='X' .OR.D»'I' 
IF OQNDGI^al 

DO ■COMPRESSION entry. PRG* 



USE nMPl 

list off fields number, L,0,P, 2, R,C,raTRy,S, DESCRIPTOR, LENSTH,RFaro,INXT, I 
Crnsfe DATABASES 
E RASE TSKPl.DBF 
GNDZF 
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t r8or& Jay abundance 

SOOT TO TEKPl ON amVri^^ 

ro •OQ&IPRESSICH abundance. PRG* . 

? ebrt /inter eat . ,r , 
SET HEADING W 
IF CQNDEKsl 

SORT TO TEMPI ON mi!BX,imSER FOR Z>0 
DO 'COMPRESSICEN interest . PBS" 



SORT Ctt?;l/D,EOTRY TO TEMPI PCR I>1 
USB TEIPl, 

list off fields nuiriber,L,D/P,Z.R,C,£amV',S,DESCRIPTOR,r.EN^ 

CUOSE CATASASES- 

ERASE TE^4Pr. DBF -.-r.' 

E^©IP - 



CASE ANAL»5- . . .. . 

* arranse/location ^ . ^ . 

SET HEADIMe CM . . . _ 

STORE 4 TO AMPLIFIER .7 . ^ ^ ■ ' , 

? 'Nuclear: • " ' - -''-•'■0>^ O-^--:- 

SORT ON ENTro^,NU^!BER FIELDS RFE^D.N^flfflER,L»D,F, 2, R,C,amiY,S, DESCRIPTOR. LaOGTH^IOT 
IF .0C2rCEN=l 

DO, "Cosnpression location. prg' .v \; V . " 

DO "Ncarmal siibroutine 1* 
EKDIP ' 

? .'^^oplaonics ' 

SORT CN ENTRY,N11M3ER FIELDS RPEND,lWMBER,L, D,F,Z,R,C,ECmy,S,DESaaproR, LETOT^ 
IF COtSJW^X 

DO "Conpression.location.prg* 



DO "Norxfial aubroutine 1* 

EKDIF 

? 'Cyt'oakeieton: • 

SORT CM EMTRY^NUMBER PISE^ RPa©*NU^BER,L,D,F,2,R,C,HH^ro,S,I3ESCRIP10R,LE2iGra,OT 
IF CQ(Q3^^1 

DO ^Ccnpressibn location. prg" 
ELSE 

DO •Norxnol. jBubroutine 1" . 



? 'Cell Burface: ' 

SORT CN ENIKY,NUMBSR FIELDS RF2OT,NIJMBER,L,D,F, 2, R,C,ECm«,S, DESCRIPTOR, LStWlH, 
IF CCNDENsl 

DO "COTpression location.prg" 
ELSE 

DO •Uonnal subroutine 1' 
ENDIF 

? 'Intracellular membrane : • 

SORT CN rNTRYiNUMHER FIELDS R?S^©,NUMBER,L.D,F,2iR,C*ENTRY, S, DESCRIPTOR, LE^TOTH, COT, I, OQM^ 
IF QQNDOIsl 

DO "ConpresBion location.prg" 
DO "Nonnal subroutine 1" 

ENDIF 

? 'Mitochondrial:* 

SORT ON anRY/NUMBER FIELDS RFQ© »NUIiaER,L|D,P, 2, R,C,EbnRy,S/ DESCRIPTOR, LDroTa,INIT, I »COMM^ 
IF CCNDEKal 

DO 'Canpresflion location.prg" 

ELSE. 

DO •Nonnal subroutine 1" 
&1DIF 

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fS^JSSJ^'^'*^^ FIELDS RFEND.,NlMHER,L,0,P,2.R,C,Emy,S,DESCRnTOR,ti^^ 

DO" "CcafQwreaaioii ' location. prg* 

DO "kosnal fiuhroutlne 1" 

EMDIP;,:'^,,^ • -JV • 
? rotheri* " 

MRT^Waroy, NUMBER FIELDS RFEra),NUMBER,L,D,F,Z,R,C,ENTOY, S,DESCiaPTOR, LENGTH, Xm,^ 

IF .-'OONDBMel ; >;v. 

DOr^Cojiprdaflion'locatibii.pra" ' 

BLS£/;' 

DO-»Nonnal subroutine !• 

ajDIS":" 

? ■ ;«Ji3mowni ' 

FIELDS RFato,NtMBER,L,D,P,2,R,C.EtnRY,S,DESCRI!TOR,LENGTC,I^ 

DO "CoBpression location .prg* 
ELSE 

DO 'Normal subroutine 1' 

ENDIF ; • • 

IF CQNDSNsl 

SSV DEV ICE. TO PRINTER 

SET PRINl^ ON 

EJBCT 

DO •Output heading.prg' 
USB -Analysifi location. dbf 
DO 'Create bargraph.]^" 

SET-HBADINO OFF 

^ , i FUNCTIONAL OASS TOTAL UNIQUE NSW % TOIAL' 

LIST OF? FIELDS Z,NAME. CLONES, GaiES, NEW, PERCENT, GRAPH 
CLC^ DATABASES 
ERASE TEKP2.DBP 
SET HEAOINQ ON 

*USE ''StaartGiyjPQxaAS3+/Mac:£ox filesiTEKEMASTER.dbf 
EUDIF ' 

* 

CASE ANALtrg 

*r .arranga/aietribution 

SET KBAOINQ ON 

STORE 3 TO AMPLIFIER 

? 'Cell/tissue apecific distribution?' 

SORT CN ENITOr,NUMBER FIELDS RFEM),NUMHEa, L,D,F, 2, R, C, ENIOT, S, DESCRIPTOR, LEK(mi, IN^ 
JF OONDBNsl 

DO "Coscprassion discrib.pry* 



D O^ 'N ormal subroutine I" 
SKDIF 

7 'Non-specific distribution! • ' • ' 

MRT ON EOTRy,NUMBER FIELDS RFEITO, NUMBER, L,D,F, 2, R,C,E^^Ry,S, DESCRIPTOR, liEMGTO,lNCT 
IF CQNDENsl 

DO "Caasaresaion distrib.prg* 



DO "Nortnal aubroutina 1" 
ENDIF 

7 'Unlaiown distribution: • 

SORT CN ENTRY,NUMBER PlEtDS RPEND, NUMBER, L,D,P, 2, R,C,EnRV,S, DESCRIPTOR, IS5GTH, INIT, I, COM^ 
IF CONDEKsl 

DO 'Ccarprefision distrib.prg" 



DO "Nozral subroutine 1" 
QIDIF 

IF CQNDEKel 

SET DEVICE TO PRINTER 

SET PRS^TER ON 
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DO . "Out^nit heading.prg' 
U3S 'Analyslfl distribution . dbf • 
DO^'Creatfs bargraph.prg: 
,SHT KSAPim OFF 

yUKCnONMi CLASS TOTAL UNIQI3B % TOTAL' 

LIST OP? PIELDS'P.NMffiiOiONES^QENESrPERCENT.GRAPH 
CLOSE DATABASES 
ERA5£;TEMP2 vDBF 
SET HEWING ON ' 

•USE",»SinartGuy:PoxBASE+/M4c:£ox files :'EEMEWaSTBR. dbf " 
HNDZF ' 

CASE A£IAL°7 
^fftixange/function . 

SSP BEADING GN 

STORE 10 TO AMPLIFIER 

' BIKDING PROTEINS' 

? 'Surface molecules and receptors s' 

S^asSE^T^'*™^ R^™D'NUMBER'UD,F,2,R,C,ENra 

DO •Cctrpression functlon.prfl" 



DO 'NoxxQ&l subroutine 1" 
SKDZF 

7 'Calcium-binding proteins: ' 

IF^CqSS^^'-^^^^^ R^'NCJblBER,L,D,F,Z,R,C,EtmiY,S,DESCRlP^^ 
DO ■Ccfflttiression function •pro" 



DO "Nozmal eubrdutine 1" 
7 'Ligands and effectors i ' 

^CoS;^^'^^^^™ "ELDS RFE©,NUMBER,L,D,F,2,R,C,an^Y,S,DSSCRlFTOR,L5N^ 
DO •CcCTpression function.prg* 



DO ^Normal svihroutine !• 
OIDIF 

7 'Other binding proteins j» 

«roND,MCJMBER,L,D,P,Z.R,C,Ermty,S,DESCRnTOR.LEr^ 

DO •Compression function .prg" 

DO "Normal subroutine.!" . 

ENDIP • 

•EJECT 

?- CNOOGENES ' 

7 'General oncogenes 

JJ^J^jj^^Y'NOMBrR FIELDS RFS^m,^3llMBER,L,D,P, 2, R,C, ENTRY, S, DESCRIPTOR, LE^ 

DO "Con^ression iunction.prg" 

ELSE ' " 

DO 'Normal Subroutine 1" 
ENDIF 

7 'GTP-binding proteins! ' 

ffwSS^^'^'""^ FIELDS HFEI©,NlMER,L,D,P,Z,R,C,ENTRy,S,D3SCRIPTaR,LBNOTH, 

DO '•Conpression function.prg* 
ELSE 

DO •Normal subroutine 1" 
ENDIP 

7 'Viral elements! • 
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DO "Normal subroutine I" 

EKDIP 

?'';'Kltia)sos and Phosphatases: ' 

FIELDS RPEND,miBE»,I*,D,?,Z,R,C,Eamiy,S,DBSckPTOR,I^^ 

cCb •Cdtapreasion £unction,prg» , 



DO "Komal subroutine 1' 
? • Tumor-related antigens t * 

IP COS^DENbI ' ' 

DO "Canpreasion function. prg* 



DO "Moanad subroutine !• 
♦EJECT 

7 ' PROTEIN STiNIHEnC MACHINERY PROTEINS' 

7 

7 * Transcription and Nucleic Acid-binding proteins:* 

SS^^EIJ^^^^'^'*™^ FIELDS RmiD,NUMH3R,L,D,F,3,R,C,EN?RY,S,DESCRIPT0R,IiErOTH,J^ 

DO 'Coinpreasion function.prg* 



DO "Normal subroutine 1' 
7 'Translation! ' 

NOMBER FIELDS RPEKD,NUyBra,L,D,F,Z,R,C,EI?mY,S, DESCRIPTOR, imjTK^INIT,^ 

IF OQNDENal 

DO *Coic3»res8ion furiccion.prg" 
ELSE 

DO •Normal subroutine !• 

5NDIF 

7 *RibosccLal x^roteins: ' 

SORT ON £NTOY,NtIMB3R FIELDS RFEt©,NUMm,L,D,P,Z,R,C,EtmiY,S, DESCRIPTOR, LENGTO^INIT,^ 
IF CCNDENsl 

DO *Ccos>re58ion function.prg" 
ELSE 

DO "Normal subroutine 1" 
ENDIF 

? 'Protein processing! ' 

SORT ON ENTRY, NUMBER FIELDS RPEND,NtWBSR,L,D,P, 2, R,C,Etmiy,S, DESCRIPTOR. I*BrOTH,INW 
IF CQNDENsl 

DO "Ccnipressirai function. prg". 



DO "Nozmal subroutine 1" 
QIDIF 

♦BOECT 

? * ENZYMES' 
7 

? 'Ferroproteinsi ' 

SORT ON ENTRY,NOMBER FIELDS RPEtTO, N13MBEK,L,D,F,Z,R,C, ENTRY, S, DESCRIPTOR, LENGTH, INIT,^ 
IF CQNDENsl 

DO "Conpression function .prg" 
DO "Normal subroutine I' 

ENDIF 

7 'Proteases and inhibitors:* 

SORT ON eiTRY,NDMBER FIELDS RPa©,NUMBER,L,D,P,Z,R,C,E2m?y,S,DE2SCRIPT0R,imn!H,INrr 
IF CCNDSNsl 

DO "Conpression function .prg* 
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ijr 



';•;? ';• Oxidative phosphorylation: ' 
'JX> •Conpreaaiott function.pry 



DO, "Normal subroutine !■ 
ENDIP. 

7/'!Sugar 'jnetabolismi • 
D0'"Conpres8ion function. prg* 



PO ."Nonaal subroutine !■ 



'Amino acid metabolismj * 

DO ■COBjiression 'iuactioa.prg' 

DO 'Mbrmal subroutine !• 
£NDr? 

?. 'Nucleic acid metaboliami ' 
, ra^^OT]^ FIELDS RFaro.WUMBro,L,O.F,2,R,c;EmY,S,D5S^ 

DO •Conpression function .prg" 

ELSE 

DO "Normal siibroutine 1* 
/SHDIP 

* ? "Iiipid netabolism: * 

^^^'^K^'I-.p.F.Z^R^C^OTTlY^S^DESCRITO^^ 

bo 'Concession function .prg* 

ELSE 

DO -Norxnal subroutine 1 • 
? *OthGr enzyxaesi* 

S^oSS^'^™^ ™^ «^'^««MBER,L,D,F,2,R,C,Em7lY,S,DESa^^ 

DO ■Compression function.prg" 
SLSE 

DO "Normal subroutine 1" - 
ENDI? . . " 

♦EJECT 

\ • " MISCEXIANEOTS CATEGORIES* 

7 'Stress 'response: * 

S^oSIsS^'''^®^ J^'^E^'t''D,F,2,R,C,EmOT,S,DESCRIPTOR,LEN^ 
DO 'Coit3>res3ion functioh.prg" 



W 'Noiinal subroutine 1" 
Q?DIF 

7 ' Structural I • 

^^^^ *'™'N^^MBE^'L^D'J''2,R,C,Emy^ 

DO 'Conpression function.prg" 



DO "Nonnal subroutine 1' 
ENDIP 

7 'Other clones i • 

glWOT^TlY.NUbffiER PIELDs'rPEI^^ 

DO •Conpression function «prg" 
ELSE 
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r?;{ AClones; of unlcnoMn func tion i ' 

^00 rconpreflflion function •prsr" 
IX) ■Korinal: subroutine !■ 

♦SE!T DEVICE TO PRINIBR 
*SET 'PRINT C3N 

JX); ^Output heading .prg" 

.USE 'taalysia function. dbf" 

» •Create bargraijh.prg* 

iSET KEADI^ 0^. 
.^***' 

^SCTSa? a ,TJf?B 0 HSADIMG "Screen 1' AT 40,2 SI2S 296,492 PIXELS TONT "QanBvaM2 COLOR 0,0,0 

1 I ' 

4 r TOEftL ZDZAL NEW DIOT 

7 . : FUNCTOMO, CXASS CLONES GEN3S GENES FUNCTICMM, CUSS' 

* 

.*L25T 0^ FIELDS P, NAME, CLONES, GENES, NEW, PERCENT, GRAPH, CCMPAOT 
LIST OPr FIELDS PiNftME, CLONES, C3E3JES, NEW, PERCENT, GRAPH 
CLOSE DATABASES 
ERASE TB1P2,DBF 
SET HEADINQ ON 

♦USE '*SEcartGuy:PaKBAaB+/Mao2fox files :TEMPMASTER. dbf • 
ENDIF 

CASE ANALsS 

DO "Sxibgroup surraary 3,prg" 
SNDCASS 

DO "Test print. prg" - 

SET PRINT OFF 

SET DEVICE TO ^RSEN 

CLOSE DATABASES 

♦ERASE TQiPLIB.DBP 

*ERASE TQ1PNUM.DBF 

♦ERASE TEMPCSSIGtDBF 

*ERASE SELfXnro.DBF 

CLEAR 

LOOP 

SNDDO 
* COMPRESSION SUBROUTINE FOR ANALYSIS PROGRAMS
USE TEMP1
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2 = 0
* Outer loop: walk the entry-sorted file one gene at a time
DO WHILE SW2=0 ROLL
  IF MARK1 >= TOT
    PACK
    COUNT TO UNIQUE
    COUNT TO NEWGENES FOR D='H' .OR. D='O'
    SW2 = 1
    LOOP
  ENDIF
  GO MARK1
  DUP = 1
  STORE ENTRY TO TESTA
  SW = 0
  * Inner loop: delete repeats of the same entry and count them in DUP
  DO WHILE SW=0 TEST
    SKIP
    STORE ENTRY TO TESTB
    IF TESTA = TESTB
      DELETE
      DUP = DUP+1
      LOOP
    ENDIF
    GO MARK1
    REPLACE RFEND WITH DUP
    MARK1 = MARK1+DUP
    SW = 1
    LOOP
  ENDDO TEST
  LOOP
ENDDO ROLL
GO TOP
STORE Z TO LOC
USE "Analysis location.dbf"
LOCATE FOR Z=LOC
REPLACE CLONES WITH TOT
REPLACE GENES WITH UNIQUE
REPLACE NEW WITH NEWGENES
USE TEMP1
SORT ON RFEND/D TO TEMP2
USE TEMP2
?? STR(UNIQUE,5,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
? '  V Coincidence'
list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT,
*SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE TEMPDESIG
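
The compression pass above is easier to follow outside xBase. The following is a rough Python sketch of the same idea, not a port of the listing: clone records that share an ENTRY are collapsed into one representative record, the number of collapsed clones is stored as that gene's abundance (RFEND in the listing), and the genes are then ordered by abundance, highest first. The record layout (dicts with `number`, `entry` and `d` keys), the function name `compress`, and the entries `HUMACTB` and `MUSXYZ` are illustrative assumptions; only `HUMEF1B` appears in the source.

```python
from itertools import groupby
from operator import itemgetter


def compress(clones):
    """Collapse clones sharing an entry; return (genes, unique, total, new)."""
    # Assumption: like TEMP1 in the listing, records are processed in entry order.
    ordered = sorted(clones, key=itemgetter("entry"))
    genes = []
    for _entry, group in groupby(ordered, key=itemgetter("entry")):
        members = list(group)
        gene = dict(members[0])        # keep the first clone as representative
        gene["rfend"] = len(members)   # abundance = number of clones collapsed
        genes.append(gene)
    # Counterpart of SORT ON RFEND/D: most abundant gene first.
    genes.sort(key=itemgetter("rfend"), reverse=True)
    total = len(ordered)
    unique = len(genes)
    # Counterpart of COUNT TO NEWGENES FOR D='H' .OR. D='O'.
    new = sum(1 for g in genes if g["d"] in ("H", "O"))
    return genes, unique, total, new


clones = [
    {"number": 1, "entry": "HUMEF1B", "d": "E"},
    {"number": 2, "entry": "HUMACTB", "d": "E"},   # made-up entry
    {"number": 3, "entry": "HUMEF1B", "d": "E"},
    {"number": 4, "entry": "MUSXYZ", "d": "O"},    # made-up entry
    {"number": 5, "entry": "HUMEF1B", "d": "E"},
]
genes, unique, total, new = compress(clones)
print(f"{unique} genes, for a total of {total} clones")
```

Grouping after a sort does in one pass what the listing does with its two nested loops and the MARK1 pointer.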
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* CGWRBSSXCN 5aBIU)DrZNS FOR'^-i^NS^VSIS ^PROQRAHS : 
U5& TEKPl 

COUNT TO ror 

REPLACE ALL HFEND wrrH 'l 

DO \fHII£ SH2aO K0LL ' 
IP-MARK1"'>= TOT 
PACK 

comr TO maQUE 

LOOP 

ENDtP 
GO MAKKl 
WP = 1 

STORE EtTTHY TO TEST^' 
6» - 0 

DO tmZLE 'SWaO TEST 
SKIP 

STORE Eimnr to testb 

IF raS TA s TE5TB 

DEZ^TE 

CUP o 

LOOP 

• WDTF 

GO hark! 

REPLACE RFEND WTIH DUP 

XMICI « H\RKl-fDOP 

SW=1 

LOOP . 

ENDDO raST 

LOOP 

SKDDO ROLL 

•BROWSE 

•♦SOT PRINTER OM 

SORT QM DATE TO TEMP2 

USE TEUP2 * - 

?? STR(UNIQUE,4,0)
?? ' genes, for a total of '
?? STR(TOT,4,0)
?? ' clones'
?
? '  V Coincidence'
COUNT TO P4 FOR I=4
IF P4>0
  ? STR(P4,3,0)
  ?? ' genes with priority = 4 (Secondary analysis:)'
  list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT for I=4
ENDIF
COUNT TO P3 FOR I=3
IF P3>0
  ? STR(P3,3,0)
  ?? ' genes with priority = 3 (Full insert sequence:)'
  list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT for I=3
ENDIF
COUNT TO P2 FOR I=2
IF P2>0
  ? STR(P2,3,0)
  ?? ' genes with priority = 2 (Primary analysis complete:)'
  list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT for I=2
ENDIF
COUNT TO P1 FOR I=1
IF P1>0
  ? STR(P1,3,0)
  ?? ' genes with priority = 1 (Primary analysis needed:)'
  list off fields number,RFEND,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,LENGTH,INIT for I=1
ENDIF
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"
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* GOK?RESSZW SUBROUTINE :FOR AKA3^^SZS . PROGRAMS - 
GOUWTO:TOr . '1..-: • ; 

REpy^,Aix-.RraND wiTOri- ■ ATr- * ' r"';,' - 

MAHK1^= 1 

DO tnULE. SW2sO' ROLL 
IF MftRKl >s TOT 

OODKT.TO UNIQUE. ; 
LOOP 

GO NARKl 

COT a-l 

fiTORB EOTRV TO TESTA 
fiW » 0 

DO WHILE*. SWaO TEST 
SKIP/. 

STORE EWTRY TO TESTS 
IF TESTA s, TESTB 
DELETE 

DUP e EqP+1 . 

LOOP ^ - • ; 

QOIF - 
GO MAPJCl 

RSPL&CE RFEND VOTR DUP 
KftRKl e.KARKl^DUP 
SWbl . 

LOOP v 

ENDDO'TEST 

LOOP 

mSDO ROLL 
^BROWSE 

*SET PRINTER ON 

SORT QM NUMBER TO TEMP2 

USE TEMP2 

?? STR(UMIQUB,4,0) / - 

7? ' genes, for a total of • 

77 Sro(TOT,5,0) . . - r . 

77 » elcQciea* 

^ • .V Coincidence* 

list off fields nuxitoer,RFSND.L,D.F,z,R,c,Ewm,s,DESCRiPTaa,LE^ 

♦SET PRINT OFF 
CLOSE DATABASES 
ERASE TEbiPl .rap 
ERASE TEMP2 .DBF 

USE •SmartOyyjFoxBASBt/M&o:£ax files: clones. dbf» 



WO 95/20681    PCT/US95/01160



*: pOKPRSSION SUBHOUTINS FOR Aia^OiVSIS PHOGRAMS 

COUOT ' TO 'roT' 

KEFIACE'AUi''RFEND WTTK 1 

MARKl 1 ' 

SW2 gO 

DO HHttiE SW2oO*-ROLL - . r ' • ''^ ■ > - ; ■ ; - ■ . : . \, . • • • 

-^''■OOtJNr^TO -UNIQUE 

^^^COUNT TO NEWSENES FOR D=»H' .OR.D='0' 
6M2«1 
LOOP 

GO 

TOP - 1 

ST0R3 Et^Y TO TB^A 

sw b 

ro NKUiE SW=0 TEST 
SKIP 

STORE aroV TO TESTS 
IF TESTA r TBSTB 

DUP = DOP+1 
LOOP 

GO MTJUQ' 

HEPIACE KFE2ro WITO DUP 
HftRKl « KT^l^OOP 
SWsl 

UOQP 

£NDDO TEST 
LOOP 

ENCffDO im- 
GO TOP 

STORE R TO FUNC 
USE "Analysis function -dbf" 
LOCATE FOR P»FUNC 
-PEFLACS OJX^ WUH TOT 
HEELACE GENES WITH UNIQUE 
PEPLACE NEW WXOH NENGQ^. 
USE TEMPI 

SORT ON HFEND/D TO TS3^2 

USE TSMP2 

SST HEADINQ W 

?? STRCUNXOOE^S^O) 

?? ■ genes, for a total o£ * 

77 STR(TOr,5,0) 

77 ' clones' 

7 ' . ' V Coincidence' 

list off fields xA2mber/RFE2^,L,DiFrZ/RfC.Qniiy,5,DE5aa?rOR, LENGTH 
«** 

♦SCREEN 1 TYPE 0 HEADING 'Screen 1- AT 40|2 SIZE 286,492 PIXELS FGOT *6enevaM2 COLOR 0,0, 
*li8t c£f fields RFE^.S, DESCRIPTOR 

*SET PRINT GPP 
CLOSE DATABASES 
ERASE TQCPl.IHF 
ERASE TECT 2tg!BF 
USE 1SMPDESIG 
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*„ CCMFItSSSZON SUBROUTINE FO^ WfdJfSlQ FROGHAKS 

USB Tafla . . ; 

D0.1i^U£ SW2aO ROUi 



IF MARKl >» TOT 

6W2'nl' ' 



GO MARKX , . 
CTORE ENTRY. TO TESTA 

sw - -0 : o , - 

CO vmXJB SMsO TEST 



(1 -i Tt' -^f* 



•i f ■ 



STQR B eWTR y. TO TSSTB T.- 

IF TESTA " TESra • ,\ r. 

pgT.prrg 

DUP = DOPfl 

.-LOOP ■ - r, . * 

GO MARKl 

X^PLACS.RFE^ Wim DOP 
MARKl a.WiRKl+DUP- • - 

SW=1 
IXX3B 

3SNDD0 ^SST 
LOOP 

ENDEX) ROlIi 
GO TOP 

STORE P TO DIST 

USE 'Analysis distribution .dbf 

ZXXZATC FC^ PsDIST 
REPLACE CLONES WITH TOT . < 
REPLACE WITO nNIQOE 

USE TEMPI 

807t m rfend/d to 

USB TaiP2 

?7 STR(UNIQUB,5,0) 

77 * genes, for a total of * 

77 ETR(TQT,5,0) 
77 ' clones* 

? ' V Coinaidencfi' 

list off fields nuraber,RPEEro,L,D,P,Z,R,C,EJnOT,S,DESCRIPIOR,LENC?ra 

*SBr PRINT GPP 
CLOSE DTlTftBASES 



- 'A 



ERASE TSMP2.PBF 
USB TOgOBSIG 
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;> COMPRESSION SUBRDOTlffi FOR' ANALYSIS spROGWIS * '~ ^ 
UaS TEMPI 

COUNT TO TOT 

RSFUCE ALL RF^^fD WITK 1 

UARKl « 1 

SW2«0 < 

CX) >2HIL£ SW2sO BOLL 
/ IF MARKl >« TOT 
PACK . . : 

: • COUNT TO tINlQUB • 

aiDip 

00 MARK! 

DUP i 1 ^ : 

ST0R2 ENTRY TO TESTA 
SW*;0 

DO WHILE" SWnO TEST 
SKIP 

STOPJB a^lZRy TO ISSTB 

IF TES TA s TESTS 

QEl^IE 

DUP .B DUP^l 

LOOP - 

^IF 
GO MARKl 

REPLACE -RFEND WITH DUP 
MARKl, s MARKl+DUP 
SW=1 ' . 
LOOP. 

SNDDO TSST 
LOOP 

ENDDO-ROIi' 

GO TO? 

USB TEMPI 

?? STR (UNIQUE, 5,0) 

77 • genes, for a total of • 

?? STR(TOT,5/b) 

7? • clones* 

• . V Coincidence* 

list Off fields nurriber,RFEND,L,D,P,Z,R,C,mOT,S,DESCRIPTOR,L^^ 

*SET PRINT OFP 
CLOSE DATABASES 
ERASE TSMPl.DB? 
USE T£»PDE8IG 
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fj^a:»masssiw suhrootine tqk j^ysis programs 

' . files .-Clones. dbf 

' COPY 'TO'TSMPl FOR 

- USB TEMPI ; • ' - • ' ■ ' 

FOR D«.N« .QR.D=.D- .OR-D-'A' .QR.D=^U' .OR.D-'S' .OR.Dn'M' .OrIS'R' .QR.&'V 

COOMr TO TOT 

REPLACS AU^ KSWO WITH 1 

MARKl B 1 . 

SH2sO 

DO WOILC SW2aO ROLL 
IP MARKl >= TOT 
PACK 

OOUOT TO UOTQtJE 

SW2al 

tOOP 

HNDIF 
GO KftRKl 
DUP » 1 

STORE ENTRy TO TESTA 
SW s 0 

DO W HILE SJfcO TEST 

Store ewerv to tests 

IP 'fiESTA o TESTS 
DELBTE' 
X3UP B DC7P+1 
LOOP 
ENDI? 
GO MARKl 

REPLACE RFEND WITH DOP 
MARKl a MARXl+DOP 
SMbl 
LOOP 

EMCliDO TEST 
LOOP 

ENDDO ROLL 
*BROWSB 

♦SET PRINTER ON 

SORT ON RFESJD/D,2JUMHER TO TEMP2 
USE TEMP3 

REPLACS AIL START WITH RPEND/IDGENB*10000 

?? STRraHQUB.S^Oj 

7? • genes, for a total of • 

77 Sra(TOT,5,0} 
77 ' clones' 

7 * Coincidence V v Clones/lOOOO ' 

set heading off 

SCREEN 1 pE D HEADING 'Screen V AT 40,2 SIZE 286,492 PIXELS TOTT -Geneva',? COLOR 0 0 0 

CLOSE DATABASES 
ERASE TEMPI. DBF 
ERASE TEaf?2.DBF 

USE *SmartGuy:PoxBASEt/Mac:fox files: clones. dbf 
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: ^ CGMPRESSICW SUBRDOTINE FDR ANALYSIS PROGRAMS 
eUSE TEMPI 



0 



■miOT. TO IDGSNB FOR D='S' •OR,Dr*0' .OR.Da'H' .OR.&»N* •QR,D='R' .OR-Da'A- 
mEIB FOR Da•N'.GR.D='D^OR.Ite•A^OR.D=•U^OR.D=•S^OR.D«•M 



PACK
COUNT TO TOT
REPLACE ALL RFEND WITH 1
MARK1 = 1
SW2 = 0
DO WHILE SW2=0 ROLL
  IF MARK1 >= TOT
    PACK
    COUNT TO UNIQUE
    SW2 = 1
    LOOP
  ENDIF
  GO MARK1
  DUP = 1
  STORE ENTRY TO TESTA
  SW = 0
  DO WHILE SW=0 TEST
    SKIP
    STORE ENTRY TO TESTB
    IF TESTA = TESTB
      DELETE
      DUP = DUP+1
      LOOP
    ENDIF
    GO MARK1
    REPLACE RFEND WITH DUP
    MARK1 = MARK1+DUP
    SW = 1
    LOOP
  ENDDO TEST
  LOOP
ENDDO ROLL
*BROWSE
*SET PRINTER ON
SORT ON RFEND/D,NUMBER TO TEMP2
USE TEMP2
* Express each gene's clone count per 10,000 identified clones
REPLACE ALL START WITH RFEND/IDGENE*10000
?? STR(UNIQUE,5,0)
?? ' genes, for a total of '
?? STR(TOT,5,0)
?? ' clones'
? ' Coincidence V  V Clones/10000'
set heading off
SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
list off fields number,RFEND,START,L,D,F,Z,R,C,ENTRY,S,DESCRIPTOR,INIT,I
*SET PRINT OFF
CLOSE DATABASES
ERASE TEMP1.DBF
ERASE TEMP2.DBF
USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"
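
The abundance listing reports each gene under a "Clones/10000" column, computed as RFEND/IDGENE*10000. A minimal Python sketch of that scaling follows. It assumes that IDGENE holds the number of clones counted before the duplicates are collapsed and that RFEND is the clone count for one gene; the function name is hypothetical.

```python
def clones_per_10000(rfend, idgene):
    """Scale one gene's clone count to clones per 10,000 counted clones."""
    if idgene <= 0:
        raise ValueError("idgene must be a positive clone count")
    # Multiplying first gives the same value as RFEND/IDGENE*10000
    # while avoiding an intermediate rounding step.
    return rfend * 10000 / idgene


# A gene seen 6 times among 4,000 counted clones:
print(clones_per_10000(6, 4000))  # 15.0
```

Scaling to a fixed denominator lets abundances from libraries of different sizes be read side by side.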
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USE, TEMPlj 
?? v*Total of' 
77. ' Clones? 

r - ' ■' 

*list off fields nuIIlber,L,D,F,a,^^,CrE^^RY,I3E^CRIPTCKl,E^ 

list off fielda"nuinberjL,D,F,zvR,C#ENiTO;i^ • ■ ' • ■ ^ - - 

CLOSE DATABASES " 
ERASE . TSM? 1 .ISF.- 
USE ;TSKPGESZG . 
*Lifeseq menu, version 8-7-94
SET TALK OFF
set device to screen
CLEAR
USE "SmartGuy:FoxBASE+/Mac:fox files:clones.dbf"
STORE LUPDATE() TO Update
GO BOTTOM
STORE RECNO() TO cloneno
STORE 6 TO Chooser
DO WHILE .T.
* Program.: Lifeseq menu.fmt
* Date....: 1/11/95
* Version.: FoxBASE+/Mac, revision 1.10
* Notes...: Format file Lifeseq menu



SCREEN 1 WPE 6 HEADJMS -Screen' 1" Ar'46,2 SIZE 286,492 PIXELS FOOT "Geneva", 268 COLOR 0 6 
« Pims 18,126 TO 77,365 SlttE 2B479 COLOR 32767, -25600,-1?-! 62 23!-167^ ' ' 

e PIXELS 110,29 TO-188,217'STOiS 3871 COLOR 0,0, ^1,-25600 > 4,^1 ' ~ -^'^^^ ^ 

5 PIXELS 45,161 BAY "LIFESEO' STSfLE 65536 FXWT . •Geneva' ,536 COLOR 0,0,-1,-1,7135.5884 
! ^^'^^^ SAY'-™" SlYLB- 65536 SWT "Geneve M2^ COLOR 0,0,-1,-1,7135,5884 

t £5='5 ^I'ltl ■Molecular Biology ^Desktop' SlYLE^ 65536 FONT "Helvetica", 18 COLOR 0,0,0, 
e PIXELS 90,252 TO 251V467 SlYLE 28447 COLOR Oi 0,-1, -25600,-^^^^^ - .^o uwiiu^ u,u,v, 

8 PIXELS 117,270 GET Chooser STYLE 65536 FONT •ChicagoM2 PICTURS ' "O^RV Transcript profiles 
8 PIXELS 135,128 SAY Update STSfLE 0 FOOT •GenevaM2 SIZE 15,79 CQLGR^O, 0,0, -25600, -l7-l • 

8 PIXELS 171,128 SAY cloneno STSTLE 0 FOOT "Geneva M2 SIZE 15,79-CQLOR 0,0,0,-^25600 -L-l' 

9 PIXEL S 135,44 SAY "Last uqpdatet" STYLE 65536 FOOT *G€nevaS12 COLOR 0,0,-1,-1 -l.ll 
8 PIXELS 171,44 SAY "Total Clones i " ^ STXLE 65536 FCCT^ "Geneva ",12 COLCR 0,0,-l,-i,*l -1 
8 PIXELS 43,296 SAY -vl.SO" STyLS 65536 FOOT "Geneva" ;7 82 COLOR 0/0,-1,-1,-1,^1 

* EOF: Lifeseq menu.£nnc 

READ
DO CASE
CASE Chooser=1
  DO "SmartGuy:FoxBASE+/Mac:fox files:Output programs:Master analysis 3.prg"
CASE Chooser=2
  DO "SmartGuy:FoxBASE+/Mac:fox files:Output programs:Subtraction 2.prg"
CASE Chooser=3
  DO "SmartGuy:FoxBASE+/Mac:fox files:Output programs:Northern (single).prg"
CASE Chooser=4
  USE "Libraries.dbf"
  BROWSE
CASE Chooser=5
  DO "SmartGuy:FoxBASE+/Mac:fox files:Output programs:See individual clone.prg"
CASE Chooser=6
  DO "SmartGuy:FoxBASE+/Mac:fox files:Libraries:Output programs:Menu.prg"
CASE Chooser=7
  CLEAR
  SCREEN 1 OFF
  RETURN
ENDCASE
LOOP
ENDDO
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03(,3O SAY 'Database Subset Analysis' STVLE €5536 TONT -Geneva', 274 COLOR 0,0,0,-1,-1,-1 



[7^; ■ "• " ' ' 

? 'Clone numbers * 
-?? STR{INIT1ATB,6,0) 

•?7 sto(terminatb;6;o) ' ' ' " 

'liibrariest ' 

^7. fAll libraries' ' - 
EKDIF 

.IF. QfnRE=2 

v' '-.iMAHKal . ^ . • , . 
. -DO WHILB .T. ' ^ 
IP MARIOSTOPrr 

EWDIF ■ ' 

VSB SELECTED 

GO HARK 

7 « ' 

77 TRIMdibname) 

STOKE taWC+l TO NARK 

LOOP 

ENDDO 

7 'Desifipiiationsi ' 

IF StaatchffO .AKD. ftnatch;=a .AKD. OnatchsQ 

77 'All* 

ENDIF 

IP Bnaeeh«sl 

?-? 'Exact,' - . - 

IF .Hznatchsl 
?? 'Human, • 

:'XF Gbiatchal 

?? 'Other ep. ' : ^ 

ENDIF 

IF CONZ3£33al 

? "Condensed format analysis' 

ENDtP 

IP:AKAL>1 

7' 'Sorted hy number* 

GNDIP 

■IF ANALs2 

? 'Sorted b/ ENTRY' . 

EMDIP 

IF ANALeS 

'? 'Arranged hy ABUITDANCE' 
^IF 
IF ANAL»4 

? 'Sorted by INTEREST' 
E^3DIF 

IF ANAL«5 

? 'Arranffed by LOCATION' 

ENDIF 

IF AZ4AL«5 

7 'Arranged by DISTRIBOTICN' 
ENDIF 

IF AtlALa? 

? 'Arranged by FUNCTICN' 
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? 'fltktal ^clones" ropresenced: 
77:-STO(STARTOT,6,0) . 

7 ITbtal: Clones :analy'2edi :*. ' 
??;ST!R(At3ALTOr;6;0) • 

7 ' 



••i " t * ' * ■ " 



• - '.,1. 



1 . . V,-'' 
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USE TEMPI 
OOONT TO TOT 
?? ' total Of 
?? Sro(IOT,4iiO) 

?7^;/';d:Laies* '■'*•■* ■ 

nifit'dff fields' nimiber,L,D,F,Z,RiC,miOT,i)ESCRIPTOR,l£TO 

list of f fields nuniiber,L,D,FiZ,R,C,E2!nOT,DESCRII^ 



USE TBCPDESIG 



I. 

{ - 



*( 



1 - 



■"1, 



1 J *, 
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1 'f 1 



OCXOT TO TOP , * .Vv , = 

' total Of '" ' -^r- 'U-..'.. - - ;s . 

?? STR{ror,4,0) 
?? ' clones! 

*Ust Off fields nuitoer,L,D,P,Z,R,C,EmY,DS9CRIITOR,2£NGW 

list off fields nurt(b«r,L,D,F;2iR/c,Et7niV,DESCRlPTQR r : . .- v.: "- ' L:" ^ 

CLOSE DATAEASSS 

ERASE TBfPlvDB?" ^ - • * := ^ . ; ■ ;■■ ; v-v .-.r.v;:;: 

USE TfMFDSSXO 
* Northern (single), version 11-25-94
close databases
SET TALK OFF
SET PRINT OFF
SET EXACT OFF
CLEAR

STORE * 'TO Eobjeot ' ' * ' : : 

STORE ' \ , , . •TO Dobject 

STORE 0 TO Numb
STORE 0 TO Zog
STORE 1 TO Bail
DO WHILE .T.
* Program.: Northern (single).fmt
* Date....: 8/ 8/94
* Version.: FoxBASE+/Mac, revision 1.10
* Notes...: Format file Northern (single)
*

SCREEN 1 «PE 0 HEADINQ "jScareen 1' AT 40,2 SIZE 286;492 PIXELS PONT "Geneva •'.12 COLOR 0,0,0 
Q PIXELS 15,81 TO 46,39^.BTmB 28447 COLOR 0,0,-1,-25600,-1,-1 
0 PIXELS 89,79 TO 192,422 STYLE 2B447' COLOR b,0;0, -25600,-1,-1 
3 PIXELS 115,98 SAY "Entry #:": STO*B 65536 FOOT •C3enevaM2 COLOR 0,0,0,-1,-1,-1 
@ PIXELS 115,173" GET Eobject STYLB 0 FCEHT •C3eneva'',12 SIZE 15,142 COLOR' 0,0, 0,-1,-1,-1 
e PIXELS 145,89 SAY "Description' STYLE 65536 FONT ?Geneva",,12 COLOR 0. 0*0, -1, -1, -I 
Q PIXELS 145,173 QST Dobject STYLE 0 PONT »QenevaM2 SIZS 15,241 COLOR 0,0,0,-1,-1,-1 
Q PIXELS 35>89 SAY "Single Northern search screen^. STYLE 65536 FOOT •Geneva", 274 COLOR 0,0,- 
e PIXELS 220,162 GET Bail STYLE 65536 PONT "Chicago", 12 PICIURS •3*R Concinue;Bail out' SIZE 
0 PIXELS 175,98 SAY "Clone ff!" STYLE 65536 PONT "Geneva M2 COLOR 0,0,0,-1,-1,-1 
8 PIXELS 175,173 GET NlKrib STYLE 0 FONT •'GenevaM2 SIZE 15,70 COLOR 0,0,0,-1,-1,-1 ' 
•8 PIXELS 80,152 SAY "Enter any CNB of the following:* STYLE 6SS36 FONT '"GenevaM2 COLOR -1, 

* EOF: Northern (single).fmt
IF Bail=2
  CLEAR
  screen 1 off
  RETURN
ENDIF
USE "SmartGuy:FoxBASE+/Mac:Fox files:Lookup.dbf"
SET TALK ON
* Search by entry name
IF Eobject<>' '
  STORE UPPER(Eobject) TO Eobject
  SET SAFETY OFF
  SORT ON Entry TO "Lookup entry.dbf"
  SET SAFETY ON
  USE "Lookup entry.dbf"
  LOCATE FOR Look=Eobject
  IF .NOT.FOUND()
    CLEAR
    LOOP
  ENDIF
  STORE Entry TO Searchval
  CLOSE DATABASES
  ERASE "Lookup entry.dbf"
ENDIF
* Search by descriptor text
IF Dobject<>' '
  SET EXACT OFF
  SET SAFETY OFF
  SORT ON descriptor TO "Lookup descriptor.dbf"
  SET SAFETY ON
  USE "Lookup descriptor.dbf"
  LOCATE FOR UPPER(TRIM(descriptor))=UPPER(TRIM(Dobject))
  IF .NOT.FOUND()
    CLEAR
    LOOP
  ENDIF
  BROWSE
  STORE Entry TO Searchval
  CLOSE DATABASES
  ERASE "Lookup descriptor.dbf"
  SET EXACT ON
ENDIF



n > - ' i 



V - 



* Search by clone number
IF Numb<>0
  USE "SmartGuy:FoxBASE+/Mac:Fox files:clones.dbf"
  GO Numb
  BROWSE
  STORE Entry TO Searchval
ENDIF
CLEAR
? 'Northern analysis for entry '
?? Searchval
? 'Enter Y to proceed'
WAIT TO OK
CLEAR
IF UPPER(OK)<>'Y'
  screen 1 off
  RETURN
ENDIF
* COMPRESSION SUBROUTINE FOR Library.dbf
? 'Compressing the Libraries file now...'
USE "SmartGuy:FoxBASE+/Mac:Fox files:Libraries.dbf"
SET SAFETY OFF
SORT ON library TO "Compressed libraries.dbf"
* FOR entered>0
SET SAFETY ON
USE "Compressed libraries.dbf"
DELETE FOR entered=0
PACK
COUNT TO TOT
MARK1 = 1
SW2 = 0
* Remove repeated library records from the sorted file
DO WHILE SW2=0 ROLL
  IF MARK1 >= TOT
    PACK
    SW2 = 1
    LOOP
  ENDIF
  GO MARK1
  STORE library TO TESTA
  SKIP
  STORE library TO TESTB
  IF TESTA = TESTB
    DELETE
  ENDIF
  MARK1 = MARK1+1
  LOOP
ENDDO ROLL
* Northern analysis
CLEAR
? 'Doing the northern now...'
SET TALK ON
USE "SmartGuy:FoxBASE+/Mac:Fox files:clones.dbf"
SET SAFETY OFF
copy TO "Hits.dbf" FOR entry=searchval
SET SAFETY ON
CLOSE DATABASES
SELECT 1
USE "Compressed libraries.dbf"
STORE RECCOUNT() TO Entries
SELECT 2
* For each library, count the matching clones and store the count in hits
DO WHILE .T.
  SELECT 1
  IF Mark>Entries
    EXIT
  ENDIF
  GO MARK
  STORE library TO Jigger
  SELECT 2
  COUNT TO Zog FOR library=Jigger
  SELECT 1
  REPLACE hits WITH Zog
  Mark=Mark+1
  LOOP
ENDDO
BROWSE FIELDS LIBRARY,LIBNAME,ENTERED,HITS AT 0,0
CLEAR
? 'Enter Y to print:'
WAIT TO PRINSET
IF UPPER(PRINSET)='Y'
  SET PRINT ON

CLEAR 



SCREQI 1:TYPE O.HSADING "Screen 1" AT. 40 #2 SX2E 28^ 1 492, PIXELS FCm "G6nevaM4 CDLW 0^0,0 
? 'DATABASE ENTRIES MATCHING WHCi ' 

  ?? Searchval
  ? DATE()
  SCREEN 1 TYPE 0 HEADING "Screen 1" AT 40,2 SIZE 286,492 PIXELS FONT "Geneva",7 COLOR 0,0,0,
  LIST OFF FIELDS library,libname,entered,hits
  ?
  SELECT 2
  LIST OFF FIELDS NUMBER,LIBRARY,D,S,F,Z,R,ENTRY,DESCRIPTOR,RFSTART,S
  SET TALK OFF
  SET PRINT OFF
ENDIF
CLOSE DATABASES
SET TALK OFF
CLEAR
DO "Test print.prg"
RETURN
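
The Northern program above amounts to a per-library tally of the clones whose entry matches the query. A hedged Python sketch of that tally follows. It is not the listing's logic line for line: it uses exact string equality, whereas the listing runs with SET EXACT OFF. The function name and the last clone row (`9999`, `HUMACTB`) are made up for illustration; the other rows mirror the sample output in Table 6.

```python
from collections import Counter


def northern(clones, library_codes, entry):
    """Return (library, hits) pairs: clones per library whose entry matches."""
    hits = Counter(c["library"] for c in clones if c["entry"] == entry)
    # Libraries with no matching clone are reported with 0 hits.
    return [(code, hits[code]) for code in library_codes]


clones = [
    {"number": 2304, "library": "U937NOT01", "entry": "HUMEF1B"},
    {"number": 3240, "library": "HMC1NOT01", "entry": "HUMEF1B"},
    {"number": 3269, "library": "HMC1NOT01", "entry": "HUMEF1B"},
    {"number": 4693, "library": "HMC1NOT01", "entry": "HUMEF1B"},
    {"number": 8989, "library": "HMC1NOT01", "entry": "HUMEF1B"},
    {"number": 9139, "library": "HMC1NOT01", "entry": "HUMEF1B"},
    {"number": 9999, "library": "KIDNNOT01", "entry": "HUMACTB"},  # made-up row
]
result = northern(clones, ["HMC1NOT01", "KIDNNOT01", "U937NOT01"], "HUMEF1B")
for code, n in result:
    print(code, n)
```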
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TABLE 6 



library 

AOENINBOI 

ADRENOT01 

AMLBNOTOI 

eMARNOTDI 

BMARNOr02 

CAHDNOTDI 

CHAOMOTOt 

OORNNOTDI 

n3RA0TD1 

RBRAQTQ2 

F1BRAMT01 

FEHNGT01 

FtaRNQTCfi 

RBPNOT01 

HMC1NOTD1 

HUVEtPBOl 

HUVENOB01 

HUVESTB01 

HYPONOB01 

KIDNNOT01 

UVRMCmoi 

LUNGNOTD1 

MUSCNOT01 

OV1DNDB01 

PANCNOT01 

prruNORoi 

PnVNOTTOI 
PUCNOB01 

SPIWFETDI 

SPLNNOT02 

STOMNOT01 

6YNORAE01 

TBLVNOTD1 

TESTNOTCn 

THP1NOB01 

"THP1PEB01 

THPIPLBOt 

UB37NOT01 



libname

Inflamed adenoid
Adrenal gland (T)
Adrenal gland (T)
AML blast cells (T)
Bone marrow
Bone marrow (T)
Cardiac muscle (T)
Chin. hamster ovary
Corneal stroma
Fibroblast, AT 5
Fibroblast, AT 30
Fibroblast, AT
Fibroblast, UV 5
Fibroblast, UV 30
Fibroblast
Fibroblast normal
Mast cell line HMC-1
HUVEC IFN,TNF,LPS
HUVEC control
HUVEC shear stress
Hypothalamus
Kidney (T)
Liver (T)
Lung (T)
Skeletal muscle (T)
Oviduct
Pancreas, normal
Pituitary (T)
Pituitary (T)
Placenta
Small intestine (T)
Spleen+liver, fetal
Spleen (T)
Stomach
Rheum. synovium
T B lymphoblast
Testis (T)
THP-1 control
THP phorbol
THP phorbol LPS
U937, monocytic leuk



t .• r 



t f 



number  library    d  s  f  z  r  entry    descriptor

  2304  U937NOT01  E  H  C  C  T  HUMEF1B  Elongation factor 1-beta
  3240  HMC1NOT01  E  H  C  C  T  HUMEF1B  Elongation factor 1-beta
  3269  HMC1NOT01  E  H  C  C  T  HUMEF1B  Elongation factor 1-beta
  4693  HMC1NOT01  E  H  C  C  T  HUMEF1B  Elongation factor 1-beta
  8989  HMC1NOT01  E  H  C  C  T  HUMEF1B  Elongation factor 1-beta
  9139  HMC1NOT01  E  H  C  C  T  HUMEF1B  Elongation factor 1-beta
WHAT IS CLAIMED IS:

1. A method of analyzing a specimen containing gene
transcripts, said method comprising the steps of:

(a) producing a library of biological sequences;

(b) generating a set of transcript sequences, where
each of the transcript sequences in said set is indicative
of a different one of the biological sequences of the
library;

(c) processing the transcript sequences in a
programmed computer in which a database of reference
transcript sequences indicative of reference biological
sequences is stored, to generate an identified sequence
value for each of the transcript sequences, where each said
identified sequence value is indicative of a sequence
annotation and a degree of match between one of the
transcript sequences and at least one of the reference
transcript sequences; and

(d) processing each said identified sequence value to
generate final data values indicative of a number of times
each identified sequence value is present in the library.



2. The method of claim 1, wherein step (a) includes 
the steps of: ^ ' 

. obtaining a mixture ,of mRNA; m 
making cDNA copies of the mRNA; 
25 isolating a representative population of clones 

transfected with the: cDNA and producing therefrom the 
library of biological sequences. ^ 

3. The method of claim 1, wherein the biological 
sequences are cDNA sequences . 

4. The method of claim 1, wherein the biological
sequences are RNA sequences.

5. The method of claim 1, wherein the biological 
sequences are protein sequences. 

6. The method of claim 1, wherein a first value of
said degree of match is indicative of an exact match, and a 
second value of said degree of match is indicative of a 
non-exact match. 
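
By way of illustration only, steps (c) and (d) of claim 1, together with the exact and non-exact match values of claim 6, might be sketched in Python as follows. The reference database contents, the names `REFERENCE_DB`, `identify`, and `tally`, and the mismatch threshold are all hypothetical. A simple position-by-position mismatch count stands in for whatever sequence comparison the programmed computer actually performs, which the claims do not specify.

```python
from collections import Counter

# Hypothetical reference database: reference transcript sequence -> annotation.
REFERENCE_DB = {
    "ATGGCGTTC": "HUMEF1B",
    "ATGCCCAAA": "HYPOTHETICAL_GENE_2",
}

EXACT, NON_EXACT = "exact", "non-exact"


def mismatches(a, b):
    """Count differing positions; any length difference counts as mismatches."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def identify(seq, max_mismatches=1):
    """Step (c): return (annotation, degree of match), or None if unmatched."""
    if seq in REFERENCE_DB:
        return (REFERENCE_DB[seq], EXACT)
    best = min(REFERENCE_DB, key=lambda ref: mismatches(seq, ref))
    if mismatches(seq, best) <= max_mismatches:
        return (REFERENCE_DB[best], NON_EXACT)
    return None


def tally(transcript_seqs):
    """Step (d): count how often each identified sequence value occurs."""
    values = (identify(s) for s in transcript_seqs)
    return Counter(v for v in values if v is not None)


# Hypothetical library of five transcript sequences.
library = ["ATGGCGTTC", "ATGGCGTTC", "ATGGCGTTA", "ATGCCCAAA", "GGGGGGGGG"]
counts = tally(library)
```

In this sketch each identified sequence value is an (annotation, degree of match) pair, so exact and non-exact matches to the same reference entry are tallied separately; sequences with no acceptable match are left out of the final data values.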



7. A method of comparing two specimens containing
gene transcripts, said method comprising:

(a) analyzing a first specimen according to the
method of claim 1;

(b) producing a second library of biological
sequences;

(c) generating a second set of transcript sequences,
where each of the transcript sequences in said second set
is indicative of a different one of the biological
sequences of the second library;

(d) processing the second set of transcript sequences
in said programmed computer to generate a second set of
identified sequence values known as further identified
sequence values, where each of the further identified
sequence values is indicative of a sequence annotation and
a degree of match between one of the biological sequences
of the second library and at least one of the reference
sequences;

(e) processing each said further identified sequence
value to generate further final data values indicative of a
number of times each further identified sequence value is
present in the second library; and

(f) processing the final data values from the first
specimen and the further identified sequence values from
the second specimen to generate ratios of transcript
sequences, each of said ratio values indicative of
differences in numbers of gene transcripts between the two
specimens.
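
As a non-limiting sketch of step (f) of claim 7, the ratios might be computed in Python as shown below. The function name `transcript_ratios` and the example counts are hypothetical. Dividing each count by its library total before forming the ratio is an assumption made here so that libraries of different sizes can be compared; the claim itself does not prescribe a normalization. Both libraries are assumed to be non-empty.

```python
import math


def transcript_ratios(first, second):
    """Ratio of each transcript's share of library 1 to its share of library 2.

    Assumption: counts are normalized by library size before comparison.
    A transcript absent from the second library yields an infinite ratio.
    """
    total1, total2 = sum(first.values()), sum(second.values())
    ratios = {}
    for gene in set(first) | set(second):
        share1 = first.get(gene, 0) / total1
        share2 = second.get(gene, 0) / total2
        ratios[gene] = share1 / share2 if share2 > 0 else math.inf
    return ratios


# Hypothetical final data values for two specimens (gene -> count).
first = {"HUMEF1B": 6, "GENE_X": 2, "GENE_Y": 2}
second = {"HUMEF1B": 3, "GENE_X": 4, "GENE_Z": 3}
ratios = transcript_ratios(first, second)
```

A ratio well above or below 1 flags a transcript as a candidate for differential expression between the two specimens; ratios of infinity or zero mark transcripts seen in only one library.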



8. A method of quantifying relative abundance of mRNA
in a biological specimen, said method comprising the steps
of:

(a) isolating a population of mRNA transcripts from
the biological specimen;

(b) identifying genes from which the mRNA was
transcribed by a sequence-specific method; 

(c) determining numbers of mRNA transcripts 
corresponding to each of the genes; and 

(d) using the mRNA transcript numbers to determine 
the relative abundance of mRNA transcripts within the 
population of mRNA transcripts. 
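
Steps (c) and (d) of claim 8 amount to counting and normalizing. A minimal Python sketch, assuming that step (b) has already produced one gene identifier per sequenced transcript, is given below; the function name `relative_abundance` and the gene identifiers are hypothetical.

```python
from collections import Counter


def relative_abundance(gene_ids):
    """Return gene -> fraction of all transcripts, most abundant first."""
    counts = Counter(gene_ids)       # step (c): transcripts per gene
    total = sum(counts.values())
    # step (d): each count expressed as a share of the whole population
    return {gene: n / total for gene, n in counts.most_common()}


# Hypothetical identifications for four sequenced transcripts.
identified = ["HUMEF1B", "GENE_X", "HUMEF1B", "HUMEF1B"]
abundance = relative_abundance(identified)
```

Because `Counter.most_common()` returns entries in descending order of count, the resulting dictionary also lists the genes in order of abundance.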



9. A diagnostic method which comprises producing a
gene transcript image, said method comprising the steps of:

(a) isolating a population of mRNA transcripts from a
biological specimen;

(b) identifying genes from which the mRNA was
transcribed by a sequence-specific method;

(c) determining numbers of mRNA transcripts
corresponding to each of the genes; and

(d) using the mRNA transcript numbers to determine
the relative abundance of mRNA transcripts within the
population of mRNA transcripts, where data determining the
relative abundance values of mRNA transcripts is the gene
transcript image of the biological specimen.

10. The method of claim 9, further comprising: 

(e) providing a set of standard normal and diseased 
gene transcript images; and 

(f) comparing the gene transcript image of the
biological specimen with the gene transcript images of step
(e) to identify at least one of the standard gene
transcript images which most closely approximate the gene
transcript image of the biological specimen.

11. The method of claim 9, wherein the biological
specimen is biopsy tissue, sputum, blood or urine.
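
The comparison of steps (e) and (f) of claim 10 might be sketched in Python as follows. The claims do not state how closeness between gene transcript images is measured; the sum of absolute differences in relative abundance used here is an assumption chosen for simplicity, and the names `image_distance` and `closest_standard`, together with all abundance values, are hypothetical.

```python
def image_distance(a, b):
    """Sum of absolute abundance differences over every gene in either image.

    Assumed measure of closeness; a gene missing from an image counts as 0.
    """
    genes = set(a) | set(b)
    return sum(abs(a.get(g, 0.0) - b.get(g, 0.0)) for g in genes)


def closest_standard(test_image, standards):
    """Return the name of the standard image nearest to the test image."""
    return min(standards,
               key=lambda name: image_distance(test_image, standards[name]))


# Hypothetical standard images (gene -> relative abundance).
standards = {
    "normal": {"GENE_A": 0.5, "GENE_B": 0.5},
    "diseased": {"GENE_A": 0.1, "GENE_B": 0.2, "GENE_C": 0.7},
}
test_image = {"GENE_A": 0.2, "GENE_B": 0.2, "GENE_C": 0.6}
best = closest_standard(test_image, standards)
```

With these hypothetical values the test image lies at a distance of about 0.2 from the diseased standard and about 1.2 from the normal standard, so the diseased standard is selected.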



12. A method of producing a gene transcript image, 
said method comprising the steps of 

(a) obtaining a mixture of mRNA; 

(b) making cDNA copies of the mRNA; 

(c) inserting the cDNA into a suitable vector and
using said vector to transfect suitable host strain cells
which are plated out and permitted to grow into clones,
each clone representing a unique mRNA;

(d) isolating a representative population of
recombinant clones;

(e) identifying amplified cDNAs from each clone in
the population by a sequence-specific method which
identifies gene from which the unique mRNA was transcribed;

(f) determining a number of times each gene is
represented within the population of clones as an
indication of relative abundance; and

(g) listing the genes and their relative abundance in
order of abundance, thereby producing the gene transcript
image.

13. The method of claim 12, also including the step
of diagnosing disease by:

repeating steps (a) through (g) on biological
specimens from random sample of normal and diseased humans,
encompassing a variety of diseases, to produce reference
sets of normal and diseased gene transcript images;

obtaining a test specimen from a human, and producing
a test gene transcript image by performing steps (a)
through (g) on said test specimen;

comparing the test gene transcript image with the
reference sets of gene transcript images; and

identifying at least one of the reference gene
transcript images which most closely approximates the test
gene transcript image.

14. A computer system for analyzing a library of
biological sequences, said system including:

means for receiving a set of transcript sequences,
where each of the transcript sequences is indicative of a
different one of the biological sequences of the library;
and

means for processing the transcript sequences in the
computer system in which a database of reference transcript
sequences indicative of reference biological sequences is
stored, wherein the computer is programmed with software
for generating an identified sequence value for each of the
transcript sequences, where each said identified sequence
value is indicative of a sequence annotation and a degree
of match between a different one of the biological
sequences of the library and at least one of the reference
transcript sequences, and for processing each said
identified sequence value to generate final data values
indicative of a number of times each identified sequence
value is present in the library.



15. The system of claim 14, also including:

library generation means for producing the library of
biological sequences and generating said set of transcript
sequences from said library.



16. The system of claim 15, wherein the library 
generation means includes:

means for obtaining a mixture of mRNA; 

means for making cDNA copies of the mRNA; 

means for inserting the cDNA copies into cells and 
permitting the cells to grow into clones; 

means for isolating a representative population of the 
clones and producing therefrom the library of biological 
sequences. 

SYBASE Database Structure

Collaborator
    Number, Name, Address, Phone

Cell Supplier
    Number, Name, Address, Phone, Fax

Biological Source
    Number, Tissue, Organ, Gender, Age, Species, Race, Pathology,
    Disease stage, Tissue weight, Source, Lot, PO, Comments

Library Preparation

Treatment Link
    Treatment name, Culture ID

Culture
    Number, Source, Lot, PO, Date, Cell density, Quantity, Protocol,
    Treatment, Comments

mRNA Prep
    Number, Culture, Lot, Date, Lapse, Quantity, height, Protocol,
    RNA yield, mRNA yield, % yield, Modifications, Gel appearance,
    Comments

cDNA Construction
    Number, Prep #, Library code, Supplier, Type, mRNA used, vector,
    primer, directions, Av size, Date ship, Catalog #, Price, Date rec,
    Cloning sites, Primary size, Background, Unamp titer, Amp titer,
    Host strain, Genotypes, Actio, Amplification, Comments